arX1v:2004.07699v1 [cs.RO] 16 Apr 2020 


Predictive Whole-Body Control of 
Humanoid Robot Locomotion 


Stefano Dafarra 


[ibris 


Dipartimento 

di Informatica, 
Bioingegneria, 
ISTITUTO ITALIANO Robotica e 

DI TECNOLOGIA Ingegneria dei Sistemi 


Supervisors: Daniele Pucci, Giorgio Metta 


Jury Members and Reviewers* 


Rachid Alami Senior Scientist at LAAS-CNRS, Toulouse 
Antonio Franchi Associate Professor at University of Twente 
Robert Griffin* Research Scientist at IHMC, Pensacola 


Ludovic Righetti Associate Professor at New York University 
Olivier Stasse* Senior Researcher at LAAS-CNRS, Toulouse 


Fondazione Istituto Italiano di Tecnologia 
Dynamic Interaction Control Lab 


Dipartimento di Informatica, Bioingegneria, Robotica e Ingegneria dei 
Sistemi, Università di Genova 


Genova, Italy 
2020 


Abstract 


Humanoid robots are machines built with an anthropomorphic shape. Despite 
decades of research into the subject, it is still challenging to tackle the 
robot locomotion problem from an algorithmic point of view. For example, 
these machines cannot achieve a constant forward body movement without 
exploiting contacts with the environment. The reactive forces resulting from 
the contacts are subject to strong limitations, complicating the design of 
control laws. As a consequence, the generation of humanoid motions requires 
to exploit fully the mathematical model of the robot in contact with the 
environment or to resort to approximations of it. 

This thesis investigates predictive and optimal control techniques for 
tackling humanoid robot motion tasks. They generate control input values 
from the system model and objectives, often transposed as cost function to 
minimize. In particular, this thesis tackles several aspects of the humanoid 
robot locomotion problem in a crescendo of complexity. First, we consider 
the single step push recovery problem. Namely, we aim at maintaining the 
upright posture with a single step after a strong external disturbance. Second, 
we generate and stabilize walking motions. In addition, we adopt predictive 
techniques to perform more dynamic motions, like large step-ups. 

The above-mentioned applications make use of different simplifications or 
assumptions to facilitate the tractability of the corresponding motion tasks. 
Moreover, they consider first the foot placements and only afterward how to 
maintain balance. We attempt to remove all these simplifications. We model 
the robot in contact with the environment explicitly, comparing different 
methods. In addition, we are able to obtain whole-body walking trajectories 
automatically by only specifying the desired motion velocity and a moving 
reference on the ground. We exploit the contacts with the walking surface 
to achieve these objectives while maintaining the robot balanced. 

Experiments are performed on real and simulated humanoid robots, like 
the Atlas and the iCub humanoid robots. 
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Prologue 


As the births of living creatures, at first, are ill-shapen: so are all Innovations, 
which are the births of time. 


Francis Bacon 


The term robot has been coined by Czech writer Karel Capek to describe 
creatures with a human-like appearance, but used for tedious work. Nowadays, 
humanoid robotics focuses on the study and development of machines where 
the resemblance to humans extends to the ability of sensing, reasoning and 
acting. The research efforts aim not only at substituting tedious tasks, but 
rather to have machines able to replace humans in a multitude of scenarios, 
including disaster situations. In 2015, the DARPA Robotics Challenge (DRC) 
took place in Pomona, California, to test 23 humanoid robots in a scenario 
representative of a nuclear disaster. The tasks to be performed involved, for 
example, driving a utility vehicle, walking through a door or manipulate a 
tool to cut a hole in a wall. All the robots were completely untethered, while 
teams were not allowed to see their robot directly. Many of the biped robots 
incurred in a fall during the trials [Guizzo and Ackerman, 2015], emphasizing 
the unripeness of this technology. 

More than 40 years after the first moving humanoid robot, walking still 
proves to be a challenging task for humanoid robots. Recent developments 
of optimal control techniques have largely increased their applicability to 
robotic applications such as humanoid locomotion. Optimal control provides 
a high level of abstraction where control actions are automatically obtained 
starting from the model of the system under control and a mathematical 
description of the task to be achieved. In addition, the availability of a 
prediction window allows performing anticipatory actions which may not be 


available to classic feedback control laws. This proves to be useful when a 
humanoid robot has to perform a step, for example. This motion requires it 
to instantiate a stable contact with the environment while absorbing eventual 
external disturbances. Hence, the mechanical structure behaves differently 
before and after the impact, requiring the control law to be robust to this 
critical transition. 

In other cases, the external disturbance is the result of an interaction with 
a human. Indeed, humanoid robots are meant to share their environment 
with humans, rather than being constricted behind safety fences. The 
physical human-robot interaction poses new challenges with social and ethical 
implications. When performing dynamic tasks, the robot needs to absorb 
impacts through intrinsic compliance while maintaining balance. As a 
consequence, it is necessary to consider the full dynamic model of the 
robot. Nevertheless, when planning motions, it may be necessary to adopt 
assumptions and simplifications to ease the tractability of the problem at 
the cost of losing descriptiveness of the model under consideration. 

In this thesis, we apply optimal control techniques to humanoid robot 
locomotion with a focus on compliance. We adopt a spectrum of models 
ranging from simplified to complete, presenting a series of applications to 
real robotic platforms. In particular, we explore different choices of model 
depending on the specific context. We finally study the implications of using 
a kinematically and dynamically detailed model. This research work has 
been carried out during my Ph.D within the Dynamic Interaction Control 
laboratory of the Italian Institute of Technology in Genova, Italy. My Ph.D. 
secondment has been carried out at the Robotics Lab within the Institute of 
Human and Machine Cognition located in Pensacola, Florida. 

This thesis is divided into three parts. 


e Part I provides the reader with a small background about the concepts 
exploited in the thesis. 


o Chapter 1 introduces the thesis with few historical curiosities. It 
also presents the robots adopted as implementation platforms for 
the concepts presented in the following parts. 


o Chapter 2 introduces a narrow set of optimal control techniques 
and their terminology. They are presented assuming a generic 
system under control. 


o Chapter 3 describes the modeling tools in case the system under 
control is a humanoid robot. 


o Chapter 4 presents a small state of the art overview, defining the 
thesis context. 


e In Part II we exploit the prediction capabilities of optimal control 
techniques to perform dynamic motions, especially those involving 
activation and deactivation of contacts. 


o Chapter 5 presents a technique which leverages the prediction 
capabilities of the controller to maintain balance by stepping after 
a large push. It uses an approximated model to generate a set of 
desired contact forces. Results are shown in simulation. 


O 


Chapter 6 exploits prediction to maintain the tracking of a ref- 
erence trajectories while taking into account several constraints 
arising during each walking phase. This controller is part of a 
layered architecture which allows the iCub robot to walk both in 
position and torque mode. 


o Chapter 7 uses trajectory optimization to perform large step-ups. 
We show that these techniques can generate dynamic motions 
obtaining a consistent maximum joint torque reduction. Experi- 
ments are performed on the Atlas humanoid robot. 


e In Part III we frame the locomotion problem into the optimal control 
framework, showing that it is possible to obtain walking motions 
automatically. Indeed, locomotion is not a trivial task. Simply moving 
forward may be considered a straightforward objective. The design 
of a controller which follows directly a forward velocity reference may 
guarantee a high reward in the short term. The robot may simply lean 
forward, achieving a perfect tracking of the velocity reference. Clearly, 
this can be incompatible with the constraints imposed by contacts. 
Hence, it may be necessary to sacrifice some of the short term reward to 
achieve a better objective in the long term. For this reason, we believe 
that in case of highly constrained systems with non-trivial objectives, 
such as the locomotion of humanoids, prediction is fundamental. 


At the same time, the definition of references plays an important role. 
Instead of specifying the full path the robot is supposed to follow, our 
goal is to define only a minimum set of references, thus limiting the 
effort from the operator. 


o Chapter 8 is mainly devoted to the definition of contacts, which 
is critical for locomotion planning. It presents the model adopted 
in the non-linear optimal control problem. 


o Chapter 9 shows how we shape the cost function to obtain walking 
motions with a minimum set of references. 


o Chapter 10 validates the walking planner and presents results on 
the real robot. 
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Part I 


Background and 
Fundamentals 


All models are wrong; some models are useful. 


George E. P. Box 


Introduction 


In the June of 1696, Johann Bernoulli, professor of mathematics at the 
University of Groningen, posed the following challenge in the issue of the 
Acta Eruditorum journal [Sussmann and Willems, 2002]: 


If in a vertical plane two points A and B are given, then it is 
required to specify the orbit AMB of the movable point M, along 
which it, starting from A, and under the influence of its own 
weight, arrives at B in the shortest possible time. [.../ 


This is known as the Brachystochrone problem, from the Greek “shortest 
time”. Johann himself, Leibniz, Johann’s elder brother Jakob, Tschirnhaus, 
l’ Hopital and Newton (even if in anonymous form) provided a solution to this 
problem, indicating the cycloid to be the trajectory with minimum time. We 
show an example of this curve in Fig. 1.1. In particular, Johann Bernoulli 
found this solution using the Fermat’s principle of least time [Erlichson, 1999]. 
The cycloid is the curve drawn by a point fixed to the outer edge of a circle 
“rolling” without slipping on a horizontal line, as depicted in Fig. 1.2. 

Even if the solution found by Bernoulli exploited a method well known 
in optics, the formulation of the challenge corresponds to an optimal control 
problem. Given the system dynamics, a ball moving under the effect of its 
own weight, the goal is to transfer its state between two points minimizing 
a metric, the total duration. Hence, 1696 can be considered as the year of 
birth of optimal control [Sussmann and Willems, 1997]. In addition, the 
Brachystochrone problem is inherently related to locomotion (from the Latin 
moving from a place) since it involves the motion of a body. 

Leonhard Euler, who became a student of Johann Bernoulli at the age of 
13, published in 1744 a book entitled The Method of Finding Plane Curves 


Linear: 0.769s Circle: 0.599s Parabola: 0.605s 


— Cycloid: 0.578s 


Fig. 1.1 Different trajectories of a point with unit mass moving from (0,0) 
to (1.0, -0.4) according to the brachystochrone problem. The duration of 
each trajectory is shown in the legend. The “Circle” and “Parabola” are 
computed such that the tangent in the first point is vertical, thus having the 
highest initial acceleration. 


Fig. 1.2 Graphical representation of the circle generating the cycloid of Fig. 
1.1. It rolls counterclockwise along the black dashed line performing a full 
turn. The blue point, rigidly attached to the circle outer edge, draws the 
cycloid. The red point is the end point of the trajectories in Fig. 1.1. 
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(b) 


Fig. 1.3 (a) A picture of the Unimate first prototype (picture downloaded from 
www.robotics.org/joseph-engelberger/about.cfm). (b) The WABOT- 
1 humanoid robot (picture from www.humanoid.waseda.ac.jp/booklet/ 
kato_2.htm1). 


that Show Some Property of Maximum and Minimum, providing the founda- 
tions of what is known as Euler’s equation. Eleven years later, a 19-year-old 
Joseph-Louis Lagrange wrote Euler a brief letter describing a revolutionary 
idea which refined Euler’s method. Lagrange derived a necessary condition 
for optimality, known today as the Euler-Lagrange equation. 

The Euler-Lagrange equation is ubiquitous in robotics too. It allows 
deriving the equation of motions governing a multi-body system. Hence, 
the work by Lagrange defines a particular common ground between optimal 
control and robotics, exploited in the second part of the 20th century. In 
fact, the first industrial robot, the Unimate #001 shown in Fig. 1.3a, was 
developed in 1959, while the first machine capable of human-like locomotion 
appeared in 1972. Its name is WABOT-1 (Fig. 1.3b) and it is the result 
of the WABOT project [Kato et al., 1974] from the Waseda University 
started in 1967. It weighs approximately 130kg and it moves thanks to 11 
hydraulically powered joints. Walking is achieved by splitting the motion 
in different phases and by storing the corresponding joint references in the 
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Fig. 1.4 (a) A wire-frame view of the iCub robot with some of the electronics 
in transparency. (b) A picture of the robot fully untethered while walking 
during a tele-operation experiment. 


“mini-computer” equipped on the robot. An analog circuit tries to match the 
desired joint value with the one measured by a potentiometer. 

Nowadays, humanoid locomotion is still a challenging problem. One of 
the difficulties is the lack of mathematical tools able to formally describe 
“balanced” and “falling” states of a mechanical structure with limbs. Indeed, 
some of the most successful conditions are drawn under strong assumptions 
and simplifications [Pratt et al., 2006, Vukobratovié and Borovac, 2004]. In 
addition, walking implies exploiting contacts with the environments. The 
mathematical description of a mechanical system with impacts is an active 
research field [Olejnik et al., 2017, Azhmyakov, 2019, Rijnen et al., 2019]. 

In this context, the abstraction and prediction capabilities of optimal 
control techniques could facilitate the development of algorithms able to 
maintain the balance of complicated machines like humanoids, while achieving 
locomotion tasks. The next sections will present the two research platforms 
exploited in this thesis: the iCub and the Atlas humanoid robots. 
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1.1 The iCub humanoid robot 


The iCub humanoid robot is a state-of-the-art open-source robotic platform 
developed at the Italian Institute of Technology as part of the European 
project RobotCub [Metta et al., 2005, Natale et al., 2017]. More than 40 of 
these robots have been built in different versions and distributed in several 
laboratories worldwide. 


1.1.1 Hardware 


The iCub humanoid robot, depicted in Fig. 1.4, is 104 cm tall, weighing 33 
kg. It possesses in total 53 degrees of freedom including those in the hands 
and in the eyes. For locomotion purposes, only 23 joints are employed and 
they are distributed in the following way: 


e 6 joints in each leg, 
e 3 joints in the torso, 
e 4 joints in the arm, 3 of which in the shoulder and one in the elbow. 


The torso and shoulder joints are mechanically coupled and driven by 
tendon mechanisms. All 23 joints are powered by brushless electric motors 
equipped with Harmonic Drive transmissions with a reduction ratio of 1/100. 

The robot is powered either by an external supplier or by a custom made 
battery. In the first case, the operating continuous voltage is 40V with about 
3A current, depending also on the task executed by the robot. The battery, 
instead, provides 36VDC and it has a capacity of 9.3Ah, which allows about 
45 minutes of continuous usage. 

A particular feature of iCub is the vast array of sensors available, that 
includes force/torque sensors, accelerometers, gyroscopes, a distributed tactile 
skin, two VGA cameras and microphones. More in detail, iCub possesses 6 
six-axes force/torque (F/T) sensors [Fumagalli et al., 2012]. In particular, 
two of them are mounted at the shoulder, the other four respectively at 
the hips and ankles. iCub also possesses distributed tactile sensors as an 
artificial skin [Cannata et al., 2008, Maiolino et al., 2013], which provides 
informations about both the location and the intensity of the contact forces. 
Notably, the skin is distributed on the torso, arms, the palm and fingertips 
of hands and legs. Because iCub is not endowed with joint torque sensors, 
the F/T sensors and the skin are used to provide an estimation of both the 
internal torques and external forces [Fumagalli et al., 2012]. Two inertial 
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Fig. 1.5 (a) Lateral and front view of the iCub left leg mechanics. (b) Bottom 
view of the iCub left foot with its dimensions in millimeters. The right foot 
is specular about the longitudinal axis. 


sensors are mounted on the robot: one on the head an the other on the waist. 
They provide acceleration and angular velocity information. 

All the motors and sensors are controlled through more than 30 electronic 
boards spread on the body, as sketched in Fig. 1.4a. They are connected 
trough an Ethernet network (CAN in previous versions or in some particular 
boards) in daisy chain. Fig. 1.5a presents the mechanical structure of the left 
leg and some of the motor control boards. An on-board computer, located in 
the robot head, provides a bridge between the internal network and external 
computers. It is equipped with a 4!” generation Intel® Core i7@1.7GHz 
and 8GB of RAM running Ubuntu. The connection to the robot can be 
established through an Ethernet cable or wirelessly thanks to a standard 
5GHz Wi-Fi network. Lastly, Fig. 1.5b shows the robot particular foot shape 
and its dimensions. 


1.1.2 Software infrastructure 


To control the robot, an infrastructure of computers is necessary. To this 
purpose, the software Yet Another Robot Platform (YARP) [Metta et al., 
2006] is employed. It is a software middleware whose main purpose is to allow 
seamless communication between “applications”, which can reside on different 
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Fig. 1.6 The iCub humanoid robot in the Gazebo environment. 


computers. Furthermore, it provides interfaces to interact with physical 
devices independently of the actual implementation, thus facilitating code 
reuse and modularity. The acquisitions from sensors and motor controllers 
are provided through YARP interfaces. 

Apart of the YARP middleware, an additional software layer is available, 
called iDynTree [Nori et al., 2015]. Its objective is to simplify the writing 
of whole-body controllers. It wraps dynamic computations such as the 
estimation of the mass matrix or inverse dynamics. The interface has 
been coded in C++. Considerably, these interfaces can also be accessed 
in MATLAB® functions, or Simulink® projects through the use of WB- 
Toolbox library[Romano et al., 2017a]. Indeed, the ease of developing a 
control system with Simulink, together with the possibility of exploiting 
simple debugging tools and existing toolboxes, makes possible the rapid 
prototyping of controllers. 

Controller prototyping also exploits the Gazebo simulation environment 
[Koenig and Howard, 2004]. It is an open-source simulator for multi-body 
systems able to detect and handle contacts through a “compliant” model. It 
is highly flexible. A simulated version of the robot, shown in Fig. 1.6, can 
be controlled through the corresponding Yarp plugins [Mingo et al., 2014] 
allowing to apply the very same controller on both the simulator and the 
real robot, thus making the actual implementation transparent. 
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Fig. 1.7 The IHMC Atlas humanoid robot performing a complex walking 
task on top of a narrow cinder block pile. 


1.2 The IHMC Atlas humanoid robot 


The author had the honor and the pleasure of testing some of the algorithms 
presented in this thesis on the Atlas humanoid robot granted to the Institute 
of Human and Machine Cognition (IHMC). The IHMC team, located in 
Pensacola, Florida, won the first place at the DARPA Virtual Robotics 
Challenge [Koolen et al., 2013] and the second place at the DARPA Robotics 
Challenge Trials and Finals [Johnson et al., 2015, 2017]. 

The robot has been designed and built by Boston Dynamics®. Neverthe- 
less, the experimental results obtained with this platform have been possible 
thanks also to the software infrastructure developed by the Robotics Lab at 
IHMC. Hence, in the following we refer to this robot as IHMC Atlas. 


1.2.1 Hardware 


The Atlas humanoid robot, showed in Fig. 1.7, is a hydraulically powered 
humanoid robot designed by Boston Dynamics®. It exploits an electric 
pump and an accumulator located slightly above the backpack, to maintain 
a hydraulic fluid at a desired pressure of 2300 PSI. It is possible to control 
the motion of hydraulic linear actuators located at each joint by controlling 
the opening of hydraulic valves. The forearms are an exception since they 
are controlled by electric motors. They are visible in Fig. 1.7 thanks to their 
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Fig. 1.8 A close-up view of the Atlas legs, also showing the KVH® IMU 
located on the pelvis. 


gray color, different from the rest of the arms. The robot has 30 degrees 
of freedom with six in each leg, seven in the arms, one in the neck, and an 
additional three in the pelvis. Fig. 1.8 focuses on the leg mechanics, with 
a close-up view of the hip roll and pitch joints and on the full leg before 
performing a large step-up. The force applied by each hydraulic actuator is 
measured through pressure sensors in the valve chambers. This measurement 
and a joint acceleration feed-forward term are used to implement joint torque 
control [Koolen et al., 2016a]. Indeed, hydraulic actuation is affected by 
high friction due to the viscous fluid flowing in the valves chambers and to 
the seals. This feed-forward term and a mechanism to compensate for joint 
backlash [Koolen et al., 2016a] allow the implementation of force control at 
the robot feet, which is fundamental for walking. Leg joint positions are 
measured through the elongation of the hydraulic actuators. 

Atlas weighs 150kg and stands 1.88m tall. It requires 480V three-phase 
voltage provided offboard by an extensible tether. During the DARPA 
Robotics Challenge, it worked with an on-board battery pack that permitted 
untethered operation for over an hour. A Carnegie Robotics MultiSense-SL 
head provides two forward-facing cameras and an axial rotating Hokuyo 
LIDAR. Atlas also has two wide-angle cameras intended to compensate for 
the robot not being able to yaw its head. An interchangeable end-effector 
mount terminates each arm, allowing third party hand installations. The 
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Fig. 1.9 The IHMC Atlas humanoid robot walking in the Simulation Con- 
struction Set environment. It presents many graphical elements indicating 
relevant quantities like desired foot poses, center of mass state, controller 
phase, joint torques, and others. 


robot possess a KVH® IMU with fiber optic gyros located on the pelvis, and 
load cells at the feet to form a 3-axes force-torque sensor. 

The sensor and valve electronics are connected via a CAN network. Each 
appendage has a separate CAN link connected to a main hub placed behind 
the front cover. The firmware governing these peripheral nodes is proprietary 
to Boston Dynamics®. A cluster of four embedded computers running 
Ubuntu is placed in the backpack and can be accessed through an Ethernet 
switch to communicate with the robot. 


1.2.2 Software infrastructure 


The software infrastructure strongly relies on Java [Smith et al., 2014]. The 
estimation thread and the main balancing controller run on the on-board 
computer at respectively 1kHz and 333Hz [Koolen et al., 2016a]. More 
information on the controller is presented in Sec. 7.1.1. The communication 
with these modules is performed by means of a custom Java implementation 
of ROS2 messages 1. 

The code-base contains over 3000 unit test [Johnson et al., 2017] which 
could be related to a single class or to a high-level behavior, like walking 


'See https ://github.com/ihmcrobotics/ihmc-java-ros2-communication. 
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or even opening doors. Indeed, many of these tests require simulating the 
robot interacting with the environment. This is done through the Simulation 
Construction Set (SCS) simulator?. An example of the SCS graphical user 
interface is shown in Fig. 1.9. In addition to the visualization of the robot 
in the top left box, it also allows on-line plotting of any of the relevant 
quantities used inside the code. This simulator has been fundamental in the 
development of other robots by the IHMC team [Pratt et al., 2009, Cotton 
et al., 2012]. SCS integrates the rigid body dynamics equations [Featherstone, 
2014] while using a compliant model for handling the contacts. Simulations 
are usually deterministic and hence highly reproducible. Nevertheless, it is 
possible to introduce artificial noise and disturbances in the measurements 
to mimic a real scenario. 

The SCS infrastructure can also be used to replay data logged from 
the robot during experiments. The values over time of more than 14000 
variables are stored after each run of the controller, together with up to four 
time-synced video streams coming from external cameras. This tool strongly 
simplifies the debugging process involved when performing experiments on 
the real robot. 


1.3 Notation 


Throughout the thesis we will use the following notation. 
e Vector and matrices are expressed with a bold symbol. 
e The ip component of a vector x is denoted as zi. 
e The transpose operator is denoted by (-)'. 


e The superscript (-)* indicates desired values. 


e Given a function of time f(t) the dot notation denotes the time deriva- 
tive, ie. f := a Higher order derivatives are denoted with a corre- 
sponding amount of dots. 


e Given a function f, we define with V2 f(y) the partial derivative of f 
with respect to æ, evaluated in y. 


? The source code is available at the following link: https: //ihmcrobotics.github.io/ 
simulation-construction-set/docs/scshome.html. 
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T is a fixed inertial frame with respect to (w.r.t.) which the robot’s 
absolute pose is measured. Its z axis is supposed to point against 
gravity, while the x direction defines the forward direction. 


1, E€ R"*” denotes the identity matrix of dimension n. 


Onxn E€ R”*” denotes a zero matrix while 0, = 0,1 is a zero column 
vector of size n. 


ei is the canonical base in R”, i.e. e; = [0,0,...,1,0,...,0]' € R”, 
where the only unitary element is in position 7. Throughout the thesis, 
n will be either 2 or 3 depending on the context. 


The operator A defines the skew-symmetric operation associated with 
the cross product in R?. Its inverse is the operator vee V. 


The weighted L2-norm of a vector v € R” is denoted by ||v||w, where 
W e R”*” is a weight matrix. 


ARB € SO(3) and 4Hpg € SE(3) denote the rotation and transforma- 
tion matrices which transform a vector expressed in the B frame, P? x, 
into a vector expressed in the A frame, 4a. 


D VAD € Rô is the relative velocity between frame A and D, whose 
coordinates are expressed in frame D. 


zcom € R? is the position of the center of mass with respect to Z. 


R2(A) € SO(2) is the rotation matrix of an angle 0 € R; S2 = Ro(m/2) 
is the unitary skew-symmetric matrix. 


n(-),R? — R? is a function returning the direction normal to the 
walking plane given the argument x and y coordinates. 


t(-), R3 > R3*? is a function returning two perpendicular directions 
normal to n(-). The composition of t(-) and n(-), [t(-) n(-)], defines 
the rotation matrix 7 Rylane- 


h(p), R3 — R defines the distance between p and the walking surface. 


diag(-),R” > R”*” is a function casting the argument into the corre- 
sponding diagonal function. 
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Optimal Control and Non-Linear 
Optimization Basics 


In this chapter, we present the basics and terminology of optimal control 
and non-linear optimization. In particular, Sec. 2.1 defines the elements 
composing an optimal control problem, while Sections 2.2 and 2.3 introduce 
the methods used to solve it. Sec. 2.4 presents the receding horizon principle, 
which is the base of the main predictive algorithm used in this thesis. Finally, 
Sec. 2.5 provides a background on non-linear optimization, used as a tool 
for the solution of optimal control problems. 


2.1 Optimal control basics 


Optimal control allows achieving a specified objective while minimizing a 
metric indicating the performances of the control actions. Compared to 
traditional techniques, there is a paradigm shift: the designer has not to 
define a control law, but rather to describe the system under control, and 
the objective to be achieved, in mathematical terms. The tuning process 
moves from the definition of controller gains to the weights describing the 
relative importance of each objective. The resulting control law is obtained 
by applying methods presented in the following. 
Consider a generic dynamical system 


t = f (x(t) u(t), t), (2.1) 


where its time evolution is a function f : R” x R™ x R > R” depending 
on the state æ € R”, the control variables u € R” and time. Eq. (2.1) is 
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an Ordinary Differential Equation (ODE). The problem of determining the 
evolution of the system, given an initial value x(to) = Xo, is called Initial 
Value Problem (IVP). On the contrary, if the terminal condition is specified, 
i.e. (T) = £r, we have a Boundary Value Problem (BVP). 

Often, a simple ODE may not be sufficient to model complex dynamical 
systems. Indeed, it may be necessary to introduce algebraic constraints, like 


gi<g (x(t), w(t), t) < gu. (2.2) 


g : R” x R™ x R > R” is a generic constraint function of dimension ng, 
bounded by gl € R’ and gu € Rs. When gl and gu coincide, Eq. (2.2) is an 
equality constraint. Then, together with Eq. (2.1), they define a Differential 
Algebraic Equation (DAE). On the contrary, when the bounds do not coincide, 
the set of equations (2.1)—(2.2) is defined as inequality constrained DAE. 
The bounds on the control inputs, which are usually present in physical 
systems, represent a common example of constraint. 

The DAE consisting of Eq.s (2.1) and (2.2) represents a first ingredient of 
an optimal control problem, defining a mathematical description of the system 
under control. In addition, the initial condition æo is also specified (e.g. 
feedback from the system). The second ingredient is the metric measuring 
how well the control law is performing according to the designer’s objective, 
i.e. the cost function. Consider applying the optimal controller in a fixed 
time window, i.e. t € [to, T]. Then, the cost function is the following: 


T 
J (eo, u(-),t) = m (#(T)) + / t (ae(r),u(r)) dr. (2.3) 
to 

The function m : R” —> R is called Mayer term and weights the terminal 
state, while the function £ : R” x R™ x R > R is the Lagrange term. The 
former allows specifying the goal the system has to reach in the terminal 
time, while the latter defines the preferred path to reach such goal. 

Finally, an optimal control problem minimizes Eq. (2.3) through a control 
policy u(-) in the interval [to, T], i.e. u[to, T]. It can be written as: 


T 
minimize J (æo, u(-),t) = m (æ(T)) +f £ (æ(T),u(T)) dr (2.4a) 


ulto,T] to 
subject. to: 
t=f (x(t), u(t), t) : (2.4b) 
x(to) = £o, (2.4c) 
gl < g (a(t), u(t),t) < gu. (2.4d) 
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Techniques for solving the optimal control problem described in Eq. (2.4) 
can be classified as either direct or indirect methods. 


2.2 Indirect methods 


Indirect methods attempt to find a solution to Eq. (2.4) by exploiting 
conditions for optimality. Historically, these have been developed considering 
no constraints, i.e. Ng = 0. 

Denote the optimal cost-to-go J* (x(t), t) as the following: 


T 

T*(e(t), t) = oe (x(T)) +f ¢(x(r), u(T)) dr. (2.5) 
ult, t 

Notice that J* (x(t), t) depends on a(t) but not on the state evolution up to 

state t. In this regard, let us introduce the Bellman principle of optimality 

[Bellman, 1957, Chap. 3.3]: 


An optimal policy has the property that whatever the initial 
state and initial decision are, the remaining decisions must 
constitute an optimal policy with regard to the state resulting 
from the first decision. 


Consider an instant tı : t < tı < T. If the optimal control policy has 
been applied in the interval from tı to T, the optimal cost-to-go starting 
from t is obtained by minimizing the sum of the cost from t to tı, plus the 
optimal cost from tı to T. In fact, we can split Eq. (2.5) as follows: 


J* (x(t), t) = int i : l (x(7), u(7)) dr+ 


(2.6) 
T 
+ inf [a (x(T)) +f L (x(7), u(7)) ar 


ult,tı] ty 


Then, the second part corresponds to the optimal cost-to-go starting from tı 
up to T. Thus, the Bellman principle of optimality implies: 


J*(x(t),t) = n i 4 (x(r), w(r)) dr + Fet) : (2.7) 


Consider tı = t + dt. Then, by employing the mean value theorem, there 
exists a constant a € [0,1] such that 


F*(w(t),0)= int, fe (w(t + adt), u(t + adt)) dt+7*(æ(t + dt), t + dt) } l 
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Under regularity assumptions for the cost-to-go function, we can obtain 
J*(a(t + dt),t+ dt) through Taylor expansion: 

_ OF* (x(t), t) Ox(t) ae 

i Ox Ot i (2.8) 
dt + O(dt)?. 


I (x(t + dt),t+dt)= J*(x(t),t) 
OT* (x(t), t) 
+8 
This relation can be substituted into the optimal cost-to-go formula, obtaining 


TO= int, fe (x(t + adt), u(t + adt)) dt + J*(æ(t), t)+ 


ult 
_OT* (x(t), t) Ax(t) OT* (x(t), t) 
a ae 


First we simplify and then we divide by dt: 


(2.9) 


dt + ora? 


0= inf f (x(t + adt), u(t + adt)) 4 


ujt i+dt] 
pe UA 0) + oan} l 


Notice that Aaa does not depend on the control input. In addition, the 
cost-to-go in the terminal instant T corresponds to the Mayer term. Finally, 
by letting dt > 0, we obtain 


oem Tan fe (w(t), u(t)) + 


I*(æ(T),T) =m (æ(T)). 

Eq. (2.10) is the Hamilton-Jacobi-Bellman(HJB) equation of optimal control 
[Kirk, 2012, Sec. 3.11]. It represents a necessary condition for a control 
policy to be optimal. Under regularity assumption for J*, it is also sufficient. 

Indirect methods attempts to find the optimal control policy starting from 
optimality conditions like the HJB. Hence, indirect methods require the user 
to solve the partial differential equation defined in Eq. (2.10), limiting the 
applicability of the controller to the specific system under control. In addition, 
the introduction of constraints (especially inequalities) is not straightforward. 
Finally, these methods may lack robustness [Betts, 2010, Sec. 4.3]. Small 
changes to the boundary conditions may result in very different policies. 

Notably, in case of unconstrained linear time-invariant systems subject 
to quadratic costs, Eq. (2.10) can be solved analytically obtaining the 
linear-quadratic regulator. 
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2.2.1 Linear Quadratic Regulator 


The linear quadratic regulator applies to linear time invariant (LTI) systems, 
characterized by the following relation: 


t = Az + Bu. (2.11) 


The cost function to be minimized is quadratic, corresponding to: 


T 
J= (T) Fa(T) 4 ; | (z(r) Qal) + u(r)" Ru(r)) dr, (212) 


where F',Q € R"*” are positive semi-definite matrices, while R € R™*™ is 
a positive definite matrix. They are all user-defined gain matrices. Through 
the HJB equations, we obtain this simple control law [Kirk, 2012, Sec. 3.12]: 


u(t) = —K(t)x(t), (2.13) 
where K € R™*” is a time-varying gain matrix given by: 
K(t)=R'B' P(t). (2.14) 


P(t) € R"*” is obtained by solving the continuous time Riccati differential 
equation [Bittanti et al., 2012): 


P(t) = -Q + PBR B! P(t)— P(t)A— A' P(t) (2.15) 
with the boundary condition 
P(T) =F. (2.16) 


Hence, the LQR provides the optimal gains of a state-feedback control 
law, obtained by solving a boundary value problem. 

The LQR can be easily extended to the case where the horizon is infinite, 
corresponding to the following cost function: 


J = sf (#(7)" Qa(r) + u(r)" Ru(r)) dr. (2.17) 


In this case, the optimal control law is equal to Eq. (2.13), with the difference 
that P(t) = P is constant, obtained by computing the stationary point of 
the Riccati equation: 


0=-Q+PBR'B'P-PA-A'P. (2.18) 


This is called Algeraic Riccati equation. The advantage of this approach is 
that the gain matrix is constant and can be computed off-line. 
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2.3 Direct methods 


Direct methods reformulate Eq. (2.4) as an optimization problem, thus 
directly attempting at finding the minimum of Eq. (2.3) subject to dynamics 
and algebraic constraints. A generic optimization problem is the following: 


minimize T(x) (2.19a) 
x 


subject. to: 
hl < h (x) < hU. (2.19b) 


Compared to Eq. (2.4) there are some notable differences. First of all we 
have only one set of variables x € R”x. In addition the dynamics (Eq. 
(2.4b)) or the concept of time evolution of the system are absent. This gap 
is overcome by discretizing the continuous system dynamics using numerical 
integration techniques, at the cost of inserting approximation errors. The 
discretization techniques are the same used to simulate generic ODEs. 


2.3.1 Common integration methods 


Integration methods determine an approximated value for x(t + dt) given 
the state evolution: 


x(t + dt) = v (x(t) u(t), t). (2.20) 


The integrator v : R” x R™ x R — R” is a function considering only one 
previous value, a(t). In case other previous values are also considered, e.g. 
a(t — dt), x(t — 2dt), and so on, the integration method is called multi-step. 
In this thesis we will focus on single-step integration methods only. The 
most common single-step method is the forward Euler integration scheme: 


x(t + dt) = x(t) + dtf (x(t), u(t), ¢) , (2.21) 
which can be seen as an approximation of the derivative operator, i.e. 
9 æ(t+dt) =a) 
tx 
dt 


Interestingly, it can be shown that for sufficiently large dt, this method 
can be subjected to numerical instability. In other words, depending on 
dt, it may be possible that the integrated state evolution diverges from the 
equilibrium point even if the dynamical system has a globally asymptotically 
stable equilibrium. On the contrary, the backward Euler method 


a(t + dt) = a(t) + dtf (w(t + dt), u(t + dt), t + dt) (2.23) 


, (2.22) 
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is numerically stable independently from the chosen time step [Ascher and 
Petzold, 1997, Sec. 3.5]. This is a property of implicit single step methods. 
The implicitness is due to the presence of a(t + dt) on both sides of Eq. 
(2.23), defining an algebraic loop. When using this method to simulate a 
dynamical system, it would be necessary to solve such algebraic loop at every 
integration step, usually through a Newton’s method [Betts, 2010, Sec. 1.5], 
increasing consistently the computational burden. 
Another implicit scheme is the trapezoidal method: 


a(t + dt) = a(t) + a (x(t), u(t), t) + 


(2.24) 
+ f (w(t + dt), u(t + dt), t + dt) ) 


which evaluates the dynamics at both ends of the integration step. For this 
reason, it is considered a multiple collocation method, while Euler schemes 
consider a single collocation point, i.e. they evaluate the system dynamics 
only once in the integration step. Another difference with respect to Euler 
methods is its accuracy. Indeed, the trapezoidal method is of order 2, while 
Euler methods are of order 1. The order indicates the approximation error 
introduced by the method. Intuitively, a method of order k corresponds to a 
Taylor expansion stopped at the kt? order, without the need of computing all 
the partial derivatives. Hence, higher order methods allow obtaining more 
accurate results using the same integration step. 

All the these single step methods can be represented via a Butcher tableau 
[Butcher, 2016, Sec. 232]. It allows storing all the relevant coefficients in a 
condensed form. For example, the Butcher tables of Euler methods are 


Forward Euler: are Backward Euler: 1j 


while the implicit trapezoidal one is 


1/2 1/2 


In its general form, the table has the following structure: 


Cı | a11 012 °t Gis 
C2 | a21 G22 ++: arX~5 P 
: : i : = 4 (2.25) 
bT 
Cs | Qs1 Qs2 **° s,s 
bi bə ayes bs 
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c contains the portions of time step at which stages will be evaluated (i.e. 
stage i will be evaluated at time t + cidt). The i-th stage, defined here with 
the symbol ‘s is an intermediate evaluation of function f: 


îs (x(t), u(-),t) = s(20) + D9 ai; js, u(t + cidt), t + cat) , (2.26) 


j=1 


with ns the number of stages. Notice that if a;,; #0 for j > i the method 
is implicit, because stage i will depends on itself, and/or following stages 
that are yet to be computed. The evaluation of stage i requires knowing the 
control values at a fraction of the time step, depending on the value of ci. It 
is often practical to assume that u(t + cidt) = u(t), independently from ci. 
In this case, we write s (a(t), u(t),t). Finally, b contains the coefficients 
used to sum up all the stages, retrieving the final solution: 


a(t + dt) = a(t) +dt X bi's, (2.27) 
i=1 


where we momentarily dropped part of the notation for simplicity. We can 
rewrite this equation in a more compact form: 


a(t + dt) = x(t) + dtS (a(t), u(-), t) b, s= ['s 2g... Msg]. (2.28) 


The methods described by Eq.s (2.28) belongs to the Runge-Kutta in- 
tegrator family. A famous example is the 4th order Runge-Kutta method 
[Butcher, 2016, Sec. 235] whose Butcher tableau is the following: 


0/0 0 0 0 
1/2}1/2 0 0 0 
1/2] 0 1⁄2 0 0, 


Pi) 0 0 1 0 
| 1/6 1/3 1/3 1/6 


which is explicit. In case of explicit Runge-Kutta methods, it can be proven 
that the number of stages is greater or equal than the order of the method 
[Butcher, 2016, Sec. 324]. In particular, explicits methods for which the 
order is equal to number of stages have been found up to order 4. On the 
other hand, the method with the least amount of stages having order 8, is 
composed by 11 stages. 

To conclude, Eq. (2.28) can be used as a constraint in place of Eq. (2.4b). 
While it may be tempting to use methods with a high number of stages to 
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exploit their accuracy, they require composing function f multiple times. 
Hence, in the case of non-linear dynamics, this would make the optimal 
control problem difficult to solve. Conversely, the advantages of explicit 
methods are limited to the solution of initial value problems (IVP). An 
optimal control problem resembles more a boundary value problem (BVP) 
because it finds a path to a target state. In this case, the distinction between 
implicit and explicit methods becomes less important [Ascher et al., 1994]. 

Eq. (2.28) can be used recursively to determine the state evolution from 
to to T. This method is called shooting. 


2.3.2 Shooting 


Single shooting 


Let us consider a (LTT) system like the one in Eq. (2.11). If we discretize it 
with the forward Euler method, we obtain the following discrete system 


a(t + dt) = a(t) + dtAw(t) + dtBu(t). (2.29) 


By assuming dt = T, with N being the number of integration steps, it 
is possible to obtain a(t) as a function of the initial state and the control 
inputs only. In fact, 


x(to + dt) = a + dt Axo + dtBu(to), 
x(t + 2dt) = z (to + dt) + dtAx(tp + dt) + dtBu(to + dt), 


«(T) 7 x(T —dt)+dtAa(T — dt)+dtBu(T — dt), 


hence it is possible to remove all the intermediate state variables obtaining: 


wo) | 


a(T) = (Ly + dtA)% ap +C 2.30 
(M= (Aw tdtA ao +O] n oap (2.30) 
u(T — dt) 
where C € R”XN™ is the controllability matrix 
C= [ax +dtA)\aB .-- (Ly +dtA)dtB dtB|. (2.31) 


Henceforth, with a single shooting method it is possible to obtain the terminal 
state starting from the initial one, without having to consider all the inter- 
mediate states. In an optimization framework, Eq. (2.30) will correspond to 
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the constraint 


u(to) 
= No l -g(T)=0. ; 
qh:= (In + dt A) o +C wT = 2dt) (T) 0 (2 32) 
u(T — dt) | 


The set of optimization variables x would consist of the control inputs applied 
at each step. As a consequence, the partial derivative of gh with respect to 
xX, a.k.a. the constraint Jacobian, corresponds to C, which is a dense matrix. 

The applicability of this method is not limited to LTI systems, but it can 
be applied also to generic non-linear systems, using any explicit integration 
method, by applying Eq. (2.28) recursively. 


Multiple shooting 


If the system is nonlinear, the composition of function f N times (as it has 
been done in Eq. (2.30)) may result in a very complex expression. In addition, 
it is difficult to specify path constraints, i.e. those involving intermediate 
states. These problems, typical of the method just introduced, are solved 
using a multiple shooting approach [Bock and Plitt, 1984, Diehl et al., 2006], 
where all the intermediate state variables are also optimization variables. It 
corresponds to fractioning the prediction horizon in multiple segments in 
which we apply an integration method. This has the clear disadvantage of 
including much more variables and constraints in the optimization problem. 
In fact, it is necessary to add a constraint similar to Eq. (2.28) for each pair 
of optimization variables. The dynamical constraints are the following, 


0 = xo + dtS (xo, u(to), to) b — æ(to + dt), 
0 = æ(to + dt) + dtS (a(to + dt), u(to + dt), to + dt) b + 


— g(to + 2dt), 
h= | Ore) (2.33) 


0 = a(T — dt) + dtS (x(T — dt), u(T — dt), T — dt) b+ 
a x(T), 
where each constraint depends only on the optimization variables related 


to the corresponding collocation points. For example, the second constraint 
in Eq. (2.33), would depend on a(tp + dt), w(to + dt) and a(to + 2dt) only, 
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while the next one on a(to + 2dt), u(to + 2dt) and a(to + 3dt). By defining 
the optimization variable x as 


u(to) 
x(to + dt) 
u(to + dt) 
x = | x(to + 2dt) | (2.34) 


u(T — dt) 
«(T’) 
the constraint Jacobian would assume a block-diagonal structure which can 
be exploited by modern solvers like in [Wachter and Biegler, 2006]. 


Since the system dynamics is discretized, so it is the cost function. In 
particular, Eq. (2.3) would become: 


T(x) = S3 £ (x(to + idt), u(to + idt)) dt. (2.35) 


Similarly, if algebraic constraints are present, it is sufficient to “repeat” 
them across the time horizon, constraining only a portion of x depending on 
the time instant. To summarize, starting from the optimal control problem 
of Eq. (2.4), we obtain the following optimization problem: 


inimi Cat idt), w(t idt) ) dt 
minimize m ( S3 (to + idt), u(to + idt)) 


subject to: 
gl < g (xo, u(to), to) < gu 
0 = zo + dtS (xo, u(to), to) b — æ(to + dt), 
gl<g (x(to + dt), u(to + dt), to + dt) < gu 
0 = w(to + dt) + dt S (a(to + dt), u(to + dt), to + dt) b+ 
— 2(to + 2dt), 


al < g (x(T — dt), u(T — dt), T — dt) < gu, 
0 = æ(T — dt) + dtS (x(T — dt), u(T — dt), T — dt) b + 
=P), 


where the optimization variables x are defined as in Eq. (2.34). This 
optimization problem can be solved with an appropriate solver. 
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2.4 Receding horizon principle 


When solving the optimal control problem described by Eq. (2.4) using an 
indirect method, as presented in Sec. 2.2, we obtain a control law which is 
function of the state. For example, a linear-quadratic regulator can be seen 
as a linear state-feedback control law with an optimal gain, as shown in Sec. 
2.2.1. On the contrary, when using a direct method, like those of Sec. 2.3, 
the output is not a control law, but a series of control values. In fact, given 
Zo, we obtain u(to +idt), with i from 0 to N—1. The application of all these 
control inputs would result in open-loop control. Alternatively, we can apply 
u(tg) only, discarding all the other control inputs. The system evolves and a 
new feedback «x is retrieved. The time horizon is shifted of an amount equal 
to dt and the optimal control problem is solved again with a different initial 
condition. This method of discarding the control values except the first, and 
shifting the time horizon, is called receding horizon principle [Mayne and 
Michalska, 1990, Clarke and Scattolini, 1991]. It provides the basis for the 
so-called model predictive control (MPC) [Garcia et al., 1989]. The stability 
of MPC can be demonstrated also in the constrained case [Zheng and Morari, 
1995, Mayne et al., 2000]. 


2.5 Basics of non-linear optimization 


Consider the following optimization problem: 


minimize T(x) (2.36a) 
x 
subject to: 
eh (x) =0, (2.36b) 
ih(x) <0. (2.36c) 


Compared to Eq. (2.19), we separated equality constraints, eh : R"x > R", 
from inequality constraints, ;h : R"* — R™, assuming to have only upper- 
bounds, without loss of generality. A point x is said to be feasible if both the 
conditions (2.36b)—(2.36c) are satisfied. A feasible point x is locally optimal, 
if there exist a scalar R > 0 such that: 


T(x) = inf{T (2) | eh (z) = 0, ih (z) < 0, lz — x] < R}. (2.37) 


In other words, x is locally optimal if all the feasible points z in a hypersphere 
of radius R around it, provide a higher cost. If an inequality constraint 
k < nį is equal to 0, i.e. ;h (x), = 0, it is said to be active. 
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2.5.1 Convex optimization problems 


Let us start with few definitions. A set C is convex if the line connecting 
any two points in C is fully contained in C, i.e. 


C is convex <= > z,y €C:0x+(1—0)jy €C, VAE [0,1]. (2.38) 


A generic function fev is convex if its domain is a convex set and, for all x 
and y belonging to its domain, we have: 


fev (Oa + (1 — 0)y) < Ofev(z) + (1— A) fev(y) Y8 [0,1]. (2.39) 


Intuitively, this means that if we take two points on the surface representing 
fev, the line connecting them is “above” the surface. As an example, quadratic 
functions like x! Hx, with H positive definite, are convex. If in Eq. (2.36) 
the cost function and the inequality constraints are convex functions, while 
the equalities are affine, then the optimization problem is said to be convex: 


minimize [.y(x) (2.40a) 
x 
subject to: 
Acx = eb, (2.40b) 
iha (x) < 0. (2.40c) 


Notice that neither A, € R"*”", nor eb € R” depend on x. For this 
particular problem, local optima are also global, i.e. condition (2.37) holds for 
any R. The proof of this statement can be found in [Boyd and Vandenberghe, 
2004, Sec. 4.2.2]. Quadratic programming (QP) problems, constituted by 
a quadratic cost function and affine constraints, are an example of convex 
optimization problems. 


2.5.2 The Lagrangian 


Given the optimization problem defined in Eq. (2.36), we define the La- 
grangian L : R”x x R”! x R™ > R as: 


£ (X eA, ir) r(x) + eA! h(x) T: iA T;h(xX), (2.41) 


where eA € R”™ and ;A € R™ are called Lagrangian multipliers. 
The dual function g : R™ x R”° — R is obtained by taking the minimum 
of Eq. (2.41) over x: 


g (eA, iA) = inf L (X, eA, iA). (2.42) 
x 
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Interestingly, the dual function provides a lower-bound to the optimal 
cost value [* = T (x*), with x* the optimal point of (2.36), as we show in 
the following. For any eA € R", we have that 


AT h(x") = 0 
since x* is feasible. For the same reason, for any ;A > 0, we have 
iA! ih(x*) < 0, 
Hence, given the Lagrangian definition in Eq. (2.41), the following holds: 
L(x*, eA, iA) <T(x*). (2.43) 
In addition, g(-) is the minimum of £ (-) over x. We finally have 
g leA iA) <T(X*), VeA ER”, and ;A > 0. (2.44) 


Intuitively, the most meaningful lower-bound is obtained by maximizing 
g(-) over eX and ;A, because it provides the highest lower-bound for the 
optimal cost. It can be obtained by solving the following optimization 
problem, called the dual problem: 


maximize g (eA, iA) (2.45a) 


e^i 


subject to: 
À> 0. (2.45b) 


In case g (eA*, ;A*) = T(x*) holds, with -A* and ;A* the optimal La- 
grange multipliers, we have strong duality. It is usually a property of convex 
optimization problems [Boyd and Vandenberghe, 2004, Sec. 5.3.2], but it 
applies also in the non-convex case under some conditions called constraint 
qualifications [Giorgi et al., 2018]. One of these is the Linear Independence 
Constraint Qualification (LICQ). It requires the gradient of all active con- 
straints (equalities included) to be linear independent at the optimum. If we 
define aih as the set of active inequalities (i.e. aih(x*) = 0), the matrix 


e eh(x*) (2.46) 


Vxaih(x*) 


has to be full row rank. 
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2.5.3 Necessary conditions for optimality 

Let us consider the following assumptions. 
1. Functions [(-), -h(-) and ;h(-) are continuously differentiable in x*. 
2. x* is a locally optimal point, while -A* and ;A* solve problem (2.45). 
3. Strong duality holds. 


Then, for a point x* to be optimal, the following necessary conditions have 
to be satisfied [Kuhn and Tucker, 1951]: 


ch (x*) =0 (2.47a) 

ih (x*) <0 (2.47b) 

iA* > 0 (2.47c) 

iA jh(x*) =0 (2.47d) 

VEX) + MTV y ehl X) +A] Vy ih x) = 0. (2.47e) 


Eq. (2.47) defines the Karush-Kuhn-Tucker conditions (KKT). Eq.s 
(2.47a)-(2.47b) impose x* to be feasible. Similarly, Eq. (2.47c) is the dual 
feasibility condition. Condition (2.47d) is the so-called complementarity 
slackness. Notice that, because of Eq. (2.47b)-(2.47c), the complementarity 
slackness imposes every product ¿Af -ihk to be equal to zero. In other words, 
if an inequality is not active, i.e. ¿hk < 0, the corresponding Lagrange 
multiplier ;A7 has to be null. Conversely, if the inequality is active, the 
multiplier can be different from zero. Finally, condition (2.47e) ensures that 
the optimal point is also a stationary point of the Lagrangian. 

If the optimization problem is convex, KKT conditions are also sufficient 
[Boyd and Vandenberghe, 2004, Sec. 5.5.3]. Otherwise, second order optimal- 
ity conditions [Diehl and Gros, 2018, Sec. 3.3] have to be employed, involving 
the Hessian of the Lagrangian, i.e. its double derivative with respect to the 
optimization variables. This means that if a point x satisfies Eq. (2.47) 
(plus other second order conditions in case of non-convex problems), then 
x is optimal. As a consequence, a classical method for finding the optimal 
solution consists in applying root-finding techniques, like the Newton method 
[Betts, 2010, Sec. 1.5], to Eq. (2.47e). The complication here is represented 
by inequalities, because of constraints (2.47c)—(2.47d). A possibility then, is 
to assume to know which inequality is active and consider it as an equality. 
In other words we guess the active set. If after some iteration one of the 
constraints (2.47b)—(2.47c) is no more satisfied, we change our guess on the 
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set of active constraints [Betts, 2010, Sec. 1.9]. The solver qpOASES [Ferreau 
et al., 2014] uses this technique to solve QP problems. Interestingly, if the 
guess on the active set is correct, the global optimum of QP problems can be 
found in a single step [Betts, 2010, Sec. 1.10]. QP solvers can also be used 
to solve generic non-linear optimization problems by resorting to sequential 
approximations of the problem. This method is called SQP (sequential 
quadratic programming) [Betts, 2010, Sec. 1.13] and it is adopted in the 
ESA (European Space Agency) solver WORHP [Biiskens and Wassel, 2012]. 

Alternatively, the interior point methods exploit logarithmic barrier 
functions to deal with inequalities [Diehl and Gros, 2018, Sec. 4.3]. In 
particular, they solve the following optimization problem: 


minimize I(x) - D9 In (—ihe(x)) (2.48a) 
k 


subject to: eh = 0. (2.48b) 


When an inequality tends to zero, the cost tends to infinity hence implicitly 
preventing it to be violated. The multiplier yz is then lowered in an iterative 
manner to reduce the approximation introduced by the barrier function. An 
example of interior point solver is Ipopt [Wachter and Biegler, 2006], widely 
used in this thesis to solve generic non-linear optimization problems. 
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Modeling of Floating-Base Robots 


While Chap. 2 considers a generic dynamic model, in this chapter we focus 
on the modeling of floating-base robots. We start defining rotation and 
transformation matrices in Sec. 3.1, while the velocity of a rigid body is 
discussed in Sec. 3.2. We then extend these considerations in the multi-body 
case in Sec. 3.3. Finally Sec. 3.4 discusses multi-body dynamics and Sec. 
3.5 presents a fundamental modeling tool widely used in the thesis, i.e the 
centroidal dynamics. Sec. 3.6 presents some of the simplifications for the 
centroidal dynamics which are popular in the literature. 


3.1 Rotation and transformation matrices 


Let us consider two frames in R? called C and D, with a coincident origin. 
Here, and in the rest of the thesis, we assume the corresponding bases to 
be orthonormal. Define “d; € R? as the i-th base of frame D expressed 
in another frame F (equivalently “c; for frame C). Trivially we have that 
Dd; = e;. At the same time, given a point p whose coordinates are defined 
in frame D, namely Pp, we obtain the corresponding coordinates in frame 
C as follows: 

Cp= a Cdo Cds Dp = C Rp?p. (3.1) 


C Rp is the rotation matrix and it belongs to SO(3), i.e. the set of R°*? 
orthogonal matrices with determinant equal to 1. 

Let us consider now another frame F having a different origin. Define as 
Cop the position of the frame F origin measured in frame C. Then, we can 
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transform the coordinates of a point tq into C as follows: 
Cq = Re" q+ “or, (3.2) 


or in a more compact form 


C Co F 
l- fone“) Et a” 


In the following, with a slight abuse of notation, we simply write 
Cq = Hr’ q, (3.4) 


where the transformation matrix “Hp belongs to the set of R*** called 
SE(3). If we assume these frames to be rigidly attached to rigid bodies, 
transformation matrices can be used to retrieve their relative pose. 

The transformation matrix © Hp, also called homogeneous transformation, 
can be easily inverted starting from Eq. (3.2): 


CRE -O R} o 
Cyy-1 F PF OCF 
H; = 3.5 
F Fe 1 | ( ) 


where we exploited the orthogonality properties of the rotation matrix. 


3.2 Rigid body velocity 


It is possible to define the 6D velocity of a rigid body in multiple ways. 
Throughout the thesis we make use of different representations depending on 
the specific case. These trivializations, as they are also called, arise from the 
fact that the relative velocity between two frames can be represented with 
different coordinate vectors depending on the frame used to measure it. 

Defining C and D as two frames moving with respect to each other, we 
define the time derivative of the relative transformation as: 


. d CR Cò 
C : D D 
Hp := ai ( Hp) = ee 0 | ‘ (3.6) 


A more compact representation can be obtained by multiplying it on the left 
with the inverse of the transformation matrix: 


CH-1C Hp = CRD -ORp°op| |°Rp Cop 
D 01x3 1 01x3 0 


(3.7) 


_|PRDCR, OR}Oòp 
01x3 0 l 
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Note that CREORp is skew-symmetric. In fact, starting from the equality 


CREO Rp = 13, (3.8) 
by differentiation we have 
CREO Rp + CRS Rp = 03x3. (3.9) 
Hence, we have 
CRL Rp =- (CRERp) (3.10) 


which is the definition of a skew-symmetric matrix. 
The information contained in Eq. (3.7) can be summarized by two 3D 
vectors, the linear and angular velocity, as follows 
van = CR} op (3.11a) 


t Vv 
Pwop = ("RD Ro) (3.11b) 


composing the left-trivialized 6D velocity: 
D Poc p 
Vop= |p > |- (3.12) 


It is also called body velocity. In fact, assuming D to be attached to a moving 
rigid body and C an inertial frame, the superscript indicates that the velocity 
coordinates are expressed in the moving D frame. With an abuse of notation, 
we can express with the operator V (and A for its inverse) the “extraction” 
of a 6D velocity from the matrix defined in Eq. (3.7): 


(3.13) 


Alternatively, we can multiply “Hp on the right side by the inverse of 
the transformation matrix: 


CRp a fe a 


Og Ctrl 
Hp“ Hy, = 
ae Ee 0 01x3 1 


(3.14) 


7 CRpo RE Còp _ CRpCR}Cop 
01x3 0 : 
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It is trivial to show that CRp°C RE is skew-symmetric, similarly to what 
has been done starting from Eq. (3.8). Then, we can construct the right- 
trivialized velocity 


C 
C r UC,D 
Veo E | (3.15) 
It is composed as follows: 
CVE p = Còp z~ CRp°R)° op (3.16a) 
. v 
Caen ("Èo RD) : (3.16b) 


This is also called inertial velocity since it is measured on the C frame. 
Notice that the linear velocity can be alternatively written as Coc D= 
Còp + Cot won. 

In some applications, it may be helpful to define the 6D velocity as 


Còp 
ke] = 


thus having the linear velocity corresponding to the time derivative of the 
relative position of the two frames. This is particularly helpful when we use 
one of the integration techniques introduced in Sec. 2.3.1 to retrieve the 
corresponding position over time. Interestingly, the quantity in Eq. (3.17) 
corresponds to the velocity expressed in a frame having the same origin of 
D, but oriented as C. Such a frame is indicated with the notation D[C]. It 
is called hybrid or mized velocity representation [Traversaro, 2017]: 


CR 0 CRICG 


(3.18) 
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The equality “Rp? WOLD wc,p is proven as follows: 


A ^ 

(C RoPwcp]) = CRp (Pwop) CRL (3.19a) 
a AVN ^ 
x Chy ( (CRE Ro) ) CRI (3.19) 
= °Rp° RPO Rp RE (3.19c) 
=°Rp RJ (3.19d) 
B A 

= ( wWo.p) (3.19e) 
CRp?wop = CwoD, (3.19f) 


where in Eq. (3.19a) we used the rotational equivalence of cross products. 
Similarly, the adjoint matrix © Xp allows obtaining a right-trivialized 
velocity starting from the left-trivialization: 


Vo p. (3.20) 


CR C ACR 
Van =°xXpPVoo = | Wr 7 


03x3 CRp 


This highlights that the relative velocity between the two frames, intended 
as a physical entity, is unique. Nevertheless, it assumes different coordinates 
according to the frame in which it is expressed. Thus, without loss of 
generality, in the following we use mainly the left-trivialized version. 

In general, the adjoint matrix can be used to express a velocity in a 
different frame. For example, let us express the velocity ? Vc,p in another 
frame F at a constant relative pose from D, * Hp. It can be shown that 
FVo p ="Vop (there is no relative velocity between D and F). Hence, 


id v 
FVop =" Vop = (CHS ("HoP Hr) ) (3.21a) 
x Vv 
= (PH; Hp Hp? Hr) (3.21b) 
Vv 
= (P Hp? Vn” Hr) (3.21¢) 
="Xp?Voap. (3.21d) 


The passage from Eq. (3.21c) to (3.21d) is similar to the rotational equivalence 
of cross products used previously, and it can be verified by hand calculations. 
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Let us drop the assumption of constant relative pose for the frame F. Its 
velocity with respect to C is the following: 


Vv 
Vor = (CHS (CH? Hr) ) (3.22a) 
é é Vv 
= (PH; Hp Hp? Hr +° Hp O Hp? Hr) (3.22b) 
z Vv 
= (P Hp? Ven? Hr + PHP” Hr) (3.22c) 
= FXp? Vop + FV) p, (3.22d) 


which provides an expression for the composition of velocities. 


3.3 Multi-body kinematics 


Let us analyze a robot composed of n; +1 rigid bodies, called links, connected 
by nj; joints with one degree of freedom each. An example of this type of 
kinematic structure is presented in Fig. 3.1. Each link has an associated 
frame. We also assume that none of the links has an a priori constant pose 
with respect to an inertial frame, i.e. the system is free floating. 

We can arbitrarily decide one of the rigid bodies to be the base link B, 
whose transformation with respect to the inertial frame is defined as 7Hp, 
and composed as follows: 


(3.23) 


We name 7p(7) as the set of joints connecting link 7 to the base. In 
addition, we assume the robot to have no closed kinematic chains. The joints 
in mg(i) are ordered such that the (k + 1)-th joint is not in the path from 
B to the k-th joint. Hence, for a given choice of B, the set 7g(7) is unique. 
For example, given Fig. 3.1, if we define link 1 as our base, we have that 
m™1(6) ={I, V}, i.e. the path from link 1 to link 6 includes joints I and V. If 
we change base, e.g. link 5, we have 75(1) ={IV, I} and 75(6) ={IV, V}. 

We define \g(k) and mp(k) as the parent and child link of joint k, 
respectively. Each joint has a single parent and child link, but a rigid body 
can be connected to multiple joints, hence obtaining a tree structure. For 
example, referring again to Fig. 3.1, we have ;(III) = 2, i.e. if we choose 
link 1 as base, link 2 appear to be the parent of joint III, while m;(III) = 3. 
If we choose link 3 as base, these two would be inverted. 
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Fig. 3.1 Schematic representation of a multi-body structure. Links are in 
blue, while joints are in grey. These are assumed to be revolute with a single 
degree of freedom. The corresponding axis of rotation is indicated with a 
dotted line. Each joint is associated with a roman numeral, while each link 
has a frame attached, named with a number. 
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We can finally reconstruct the position of a link L as follows: 


7H, ="*Hp [| Hn (3.24) 
kenp(L) 


The transformation matrix *2(*) H p(k) depends upon the position of 
joint k, ie. ABC) H m p(k) (Sk): In this thesis we consider only revolute joints, 
characterized by an axis a € RÌ with unitary modulus. Let us consider an 
additional two frames \g(k) and mpg(k) belonging to the parent and child 
link respectively. They are placed such that `e) H np) (0) = 14. Thanks 
to this assumption, a has the same (constant) coordinates when defined 
either in \p(k) or mpg(k). Then, the joint transform is obtained thanks to 
the axis-angle formalism: 


7 AB(k) R- 0 
AB(*) H a (ey (Sk) = hee n A 
r3() R anf (Sk) = 13 + cos(sz)(,a)” + sin(sp) ((,a)*)”, 


obtaining the final relation 
AB) Fl a n(e) (Sk) =P yy) O Hg Se)? Hinge) (3-26) 


Eq. (3.24) defines the forward kinematics function. It returns the 
absolute pose of a link given the base pose and the joint values s € R™, i,e. 
FK : SE(3) x R™ — SE(3): 


TH, = FK (Hs, s) l (3.27) 


Let us now define the velocity of a link. We can express Eq. (3.24) by 
separating the part depending on the base pose from the joints dependency: 


TH, =1Hp” Hy, (3.28) 
thus, by applying Eq. (3.22), we obtain 
LV 1, = Xp” Vr g + "VB L. (3.29) 


While ? Vz,pB is assumed to be known, is Vz,1 is a function of joints velocity 
s € R™. In fact, 


BHA, = D 7 Aya) 5— (PO Hse) mak) H;i, (3.30) 


and, by using the same derivation of Eq. (3.21), we obtain 


i V 
Ly, = (PH; 7 HL) 
= AB(k mB (k 
= ` (moz B(k) H w) B( H) Šk 
kerp(L 
Vv 
= SD (Hp A ô (rsh) Fy mak) et, ) ŝ 
mp (k) melk) Og, mp(k) L k 
kerp(L 
Fa) Vv 
ae AB(k —1 àp(k 
= >, PX more) ( g 58) Baz ( P Hms) ) $k 
kerp(L 
LVg r = yo ” Xing (kh) 2s Se. (3.31) 
kenp(L 


malk)s (more verbosely melh) s) | (&) ma(k)) is the left-trivialized joint motion 
subspace. It is constant and it can be proven that, for a revolute joint, this 
depends on the rotation axis ,a, [Traversaro, 2017]: 


Wi (0) 
BO E S | 3 (3.32) 


To summarize, “Vp L is an affine function of joints velocity $ € R™, 
allowing us to write: 
Vp r = "JB r$, (3.33) 


and including also the base velocity, 


B 
ae = “Jor v. (3.34) 


LW; = Xp EI$ z] 


“Ir c R&x(6+n5) is the left-trivialized Jacobian of link L, while v € Rt” 
is the multi-body system velocity. The k-th column of 4J A z is equal to 


X ma melk)s if k € Tg(L), the zero vector otherwise. 


3.4 Multi-body dynamics 


The robot configuration space is characterized by the position and the 
orientation of the base frame B, and the joint configurations. Thus, it 
corresponds to the group Q = R? x SO(3) x R” and an element q € Q can 
be defined as the following triplet: q = (fpg,? Rp, 8). 
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The velocity of the multi-body system can be characterized by the algebra 
V of Q defined by: V = R? x R? x R”. An element of V corresponds to v. 

We also assume that the robot is interacting with the environment 
exchanging ne distinct wrenches!. Due to the fact that the configuration 
space is not vectorial, we cannot apply the classical Euler-Lagrange equations. 
This is solved by employing the Euler-Poincaré formalism [Marsden and 
Ratiu, 2010, Ch. 13.5], obtaining as a final result: 


Ts +Y Ie, nf (3.35) 


M(q)v + C(q,v)v + G(q) = E 
k=1 


where M e R"+6*"+6 is the mass matrix, C € R("+®)*("+8) accounts for 
Coriolis and centrifugal effects, G € R"*+® is the gravity term. Ts € R” is a 
vector representing the internal actuation torques, while f € RÊ denotes the 
k-th external wrench applied by the environment on the robot. In particular, 


it is composed as follows: 
SAET 
pf = | ql (3.36) 


k 


with kf, pT € R being the contact force and torque respectively. The 
Jacobian Je, = Jc,(q) is defined in different trivializations depending on 
the choice of frame used to measured the external wrench. In most of the 
applications described in this thesis, we assume the wrenches to be measured 
in a frame located on the contact link origin and oriented as Z. Hence, Je, is 
expressed in mixed representation. By stacking all the Jacobians and contact 
wrenches, we can rewrite Eq. (3.35) as follows: 


; O6xn 
M(q)v + C(q,v)v + G(q) = | i Ts + Jef (3.37) 
where 
Je, (q) if 
Jc(q) = : , f=|: j. (3.38) 
Jen (q) kf 


Lastly, it is assumed that a set of holonomic constraints acts on System 
(3.37). These are of the form c(q) = 0. The holonomic constraints associated 
with all the rigid contacts can be represented as 


Jc(q)v = 0, (3.39) 


1 As an abuse of notation, we define as wrench a quantity that is not the dual of a twist, 
but a 6D force/moment vector. 
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indicating that the velocity of the associated links is supposed to be zero. 
Eq. (3.39) can be differentiated 


Jet + Jev =0 (3.40) 


obtaining a dependency on ù. Eq.s (3.37) and (3.40) together are the dynam- 
ical equations describing the motion of a floating-base system instantiating 
contacts with the environment. 

As described in [Traversaro, 2017, Sec. 5], it is possible to apply a 
coordinate transformation in the state space (q, v) that transforms the system 
dynamics (3.37) into a new form where the mass matrix is block diagonal, 
thus decoupling joint and base frame accelerations. Also, in this new set of 
coordinates, the first six rows of Eq. (3.37) corresponds to the centroidal 
dynamics. In the specialized literature, this term is used to indicate the rate 
of change of the robot’s momentum expressed at the center of mass, which 
then equals the summation of all external wrenches acting on the multi-body 
system [Orin et al., 2013, Wensing and Orin, 2016]. 


3.5 Centroidal dynamics 


By definition, the center of mass (CoM) acom corresponds to the weighted 
average of all the links CoM position: 


Mei 5 
sem = Hp Y, m H; ‘Pcom: (3.41) 
i 


where pcom € R? is the (constant) CoM position of link i expressed in i 
coordinates. m,m; E€ R are respectively the robot total mass and the i-th 
link mass. In order to introduce the centroidal dynamics, it is convenient 
to define a frame, called G, whose origin is located on the CoM, while the 
orientation is parallel to the inertial frame Z. We introduce gh € RÊ as the 
robot total momentum expressed in this frame, such that 


~h? 
gh = H (3.42) 


where gh? € R? and gh” € R? are respectively the linear and angular 
momentum. In addition, the following holds: 


1 
toom = —Gh?. (3.43) 
m 
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The robot total momentum corresponds to the summation of all the links 
momenta, projected on the same frame G: 


gh =) @X? eh. (3.44) 


Notice that the adjoint matrix is the one used to transform wrenches. They 
are the dual of the adjoint matrices presented in Sec. 3.2. In particular 
pXČ := CXT, We can expand all the momenta and write: 


gh = GX? Y BXL Vai (3.45) 
i 


with I; € R°*® being the (constant) link inertia expressed in link frame. 
Hence, it is a function of the robot velocity v: 


gh = JCMMY (3.46) 


where Jemm € R®°*” is the Centroidal Momentum Matrix (CMM) [Orin and 
Goswami, 2008]. 

The centroidal momentum rate of change balances the external wrenches 
applied to the robot: 


S (3.47) 


Ne 


T 
Ry 03x3 z 
= J f+m 
4 (Fox, as com)” IR, TR, k g 


The adjoint matrix ¢X k € RX transforms the contact wrench from the 
application frame (located in 70, with orientation 7R,) to G. Finally, 
g=[90-g00 o]! is the 6D gravity acceleration vector. 
Alternatively, the centroidal momentum dynamics can be obtained by 
differentiating Eq. (3.46): 
ahi = Jommv + Jemmy, (3.48) 


thus highlighting its dependency on v. 


3.6 Simplified models 


In this short section we introduce how the robot dynamics is simplified 
obtaining two linear models which are widely used in the legged locomotion 
literature: the linear inverted pendulum model and the Capture Point. 


48 


3.6.1 The linear inverted pendulum 


The linear inverted pendulum model (in short LIP) approximates a legged 
robot by considering only the center of mass, represented as a point mass on 
the top of a pendulum with no inertia. The other edge is the “foot”, set in 
rigid contact with the ground. pæ € R? is its position in an inertial frame. 

The LIP dynamics is obtained starting from Eq. (3.47) considering only 
the linear momentum. We assume a single external force to be present, called 
pf € R3. This is measured in an inertial frame, applied on pæ, and parallel 
to the pendulum. Thus, the equations describing the LIP dynamics are: 


MËCoM = pf + Mg, (3.49a) 
(px = gcom) ^ pf =0. (3.49b) 


Notice that, thanks to Eq. (3.49b), the angular momentum is constant, i.e.: 
eh? =0. (3.50) 
Let us assume now the point mass to be always at a constant height, i.e. 
e3 (p£ = CoM) = ZCoM,z- (3.51) 

This implies ma@com,z = 0, which corresponds to 
pfe = mlg. (3.52) 


The remaining components of pf can be computed starting from Eq. (3.49b) 
and Eq. (3.51): 


pfx = mw’e] (aco — p£), (3.53a) 
pfy = mweg (com T pæ) D (3.53b) 


where w = 4/ = dit is the dominant time constant of the LIP. 


By substituting Eq.s (3.52)-(3.53) into Eq. (3.49), we obtain: 

e1 

Soa = | eg | w? (toom — pæ) . 
03x1 


Since the dynamics is different from zero only in the planar coordinates, we 
can define 


oT 
TLIP = zl CoM, Zip ER’, (3.54) 
2. 
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and, without loss of generality, we consider pæ € R?. We then obtain the 
LIP dynamic equation, 


Eup = w’ (£P — px), (3.55) 


which is linear. It possess a single equilibrium point, namely 


TLIP| | px 
; = : 3.56 
This model can be enriched considering also a finite sized-foot [Kajita 
et al., 2001a,b, Koolen et al., 2012]. In this case, it can be shown that pæ 


corresponds to the Zero Moment Point position £zzmp [Vukobratovié and 
Borovac, 2004], instead of the foot position. 


3.6.2 The Capture Point 


The Capture Point [Pratt et al., 2006] is defined as the point on the ground 
where the robot has to step in order to come to a full stop, as shown in Fig. 
3.2. Eq. (3.55) can be rewritten as follows?: 


= 4 nit (3.57) 


with o = |i, dp] . The dynamical system of Eq. (3.57) can be 
explicitly solved [Englsberger et al., 2011], obtaining the following relation: 


n | cosh(wt) ies o(0) + ; — n ae (3.58) 


wsinh(wt) cosh(wt) —w sinh(wt) 


Eq. (3.58) can be used to find an expression for the Capture Point starting 
from its definition. In order for the pendulum to come to a complete stop, 
its asymptotic position needs to converge to the equilibrium point. Hence, 
let us substitute £p with pæ, obtaining 


1 
cosh(wt) aprp(0) + 7 sinh(wt) urp (0) — cosh(wt) pæ = 0, (3.59) 


2 With an abuse of notation we write the dynamical system matrices as if £p and its 
velocity were 1-dimensional, exploiting the fact that planar directions are decoupled. 
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from which we get 
1 ; 
p£ = xprp(0) + 5 tanh(wt) urp (0). (3.60) 


Since we are interested in the asymptotic solution, we let t — co which 
results in: 


1, 
p£ = zip (0) + | eu (0). (3.61) 


Hence, given a generic state ø, the Capture Point xop € R? is defined as: 


1. 
TOP = TLIP + Pela (3.62) 


In order to find the Capture Point dynamics, let us analyze again the 
system of Eq. (3.57). The associated eigenvalues are: 


Aq 2 = EW (3.63) 


’ 


which indicate a neutral saddle point (an equilibrium with two real eigenvalues 
symmetric with respect to the imaginary axis [Khalil, 2002, Section 2.1]). 
The corresponding eigenvectors are: 


1 1 
wo —W9 


v = (vı v2) = | (3.64) 


We diagonalize the dynamical system of Eq. (3.57) through the relation: 


B 


where the transformation matrix T and its inverse T~! are defined as 


1/2 1/2 P 1 rive 
ao Re =) ie f e i oo 


o=T He EER’, BER’, (3.65) 


The matrix T is obtained by multiplying the eigenvectors matrix by 1/2. It 
is possible to notice that, using this transformation matrix, € is identical to 
zcp. Then, the corresponding diagonal system is: 


E|  |aop|__ |w 0 | |acp —w 
$ Nail eaa +| | pE (3.67) 
Hence, the Capture Point dynamics 
TCP =W (xop E pT) (3.68) 


corresponds to the unstable part of the LIP dynamics (positive eigenvalue). 
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Fig. 3.2 Illustration of the Capture Point concept. It is the point over which 
the pendulum “steps” in order to stop in the upright position. This point 
depends on the wz,;p position and velocity. 
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State of the Art and Thesis Context 


This short chapter presents the state of the art, focusing on the different 
components of a layered architecture. Sec. 4.2 contextualize the thesis. 


4.1 State of the art 


Humanoid robots are underactuated [Spong, 1998] which, in essence, means 
that by exploiting all theirs actuated degrees-of-freedom they can control 
their internal configuration, but they cannot affect their global pose directly. 
Nevertheless, it is possible to circumvent this limitation by exploiting contacts 
with the environment and the robot ability to change “shape”. Additional 
difficulties arise from the fact that contacts may change over time. This 
results in a different evolution of the constrained dynamical system making 
the overall system hybrid [Lygeros et al., 1999], i.e. it possesses both a 
continuous and discrete time dynamics. 

A recent approach for bipedal locomotion control that became popular 
during the DARPA Robotics Challenge, consists in defining a hierarchical 
control architecture [Feng et al., 2015]. Each layer of this architecture 
receives inputs from the robot and its surrounding environment, and provides 
references to the layer below. The lower the layer, the shorter the time 
horizon that is used to evaluate the outputs. Also, lower layers usually 
employ more complex models to evaluate the outputs, but the shorter time 
horizon often results in faster computations for obtaining these outputs. 

From higher to lower layers, trajectory optimization for footstep planning, 
simplified model control, and whole-body control represent a common strat- 


53 


ification of architectures for bipedal locomotion control [Carpentier et al., 
2017, Romualdi et al., 2018}. 


4.1.1 Trajectory optimization for footstep planning 


This first layer is in charge of finding a sequence of robot footsteps, which 
is also crucial for maintaining the robot balance. A fundamental ability 
of humanoid robots consists, for example, in moving over steps and stairs, 
where they can exploit the legged configuration. A possible approach for the 
generation of humanoid motions over stairs is to tackle the problem as an 
extension of the planar walking motion generation [Hirai et al., 1998, Hu 
et al., 2016, Caron et al., 2019]. More in general, when planning locomotion 
trajectories, the definition of contacts plays a central role. Several strategies 
are available in literature, here summarized in four categories. In particular, 
we distinguish them depending on how much information on the contact is 
supposed to be provided by the designer. 


1. Everything may be set by the designer, including contact sequence, 
location and timing. In other words, it is specified beforehand where, 
when and how the robot is supposed to step. Planning is performed 
on the remaining quantities, typically CoM state and body postures. 


2. With respect to the previous case, the designer may specify only the 
sequence of contacts. The actual contact timing and location are an 
output of the trajectory planner. 


3. In an increased level of detail, it is possible to model the contact 
activation and the deactivation through integer variables. 


4. Lastly, it is possible to model contacts as part of the dynamical system 
describing the robot behavior. With such a complete modeling, all 
the contact related quantities are an output of the planner. In this 
particular category, we focus on complementarity-free methods which 
do not rely on classical complementarity conditions. The reason behind 
this choice is addressed in Chapter 8. 


In the following, we expand these four categories. 
Fixed contact sequence, timing and location. The assumption of 
knowing in advance where the contacts will be established and in which 


instant simplifies the planning problem. Planning then focuses on the CoM 
state. In [Caron and Kheddar, 2016] only the CoM linear acceleration is taken 
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into consideration, while in [Dai and Tedrake, 2016] the authors propose a 
convex upper bound of the angular momentum to be minimized. In [Ponton 
et al., 2016] the derivative of the angular momentum is approximated by 
using quadratic constraints together with slack variables necessary to keep 
the approximation error low. Nevertheless, in this approach, it is not possible 
to directly penalize the use of the angular momentum, while it introduces 
many additional variables into the formulation. Similarly, in [Fernbach et al., 
2019] the CoM trajectory is forced to be polynomial with only one free 
variable. The resulting optimization problem is convex with the drawback 
of limiting the set of trajectories that can be generated depending on the 
choice of polynomial function. 

Body postures can also be planned together with the centroidal quantities, 
thus considering also the robot kinematic structure, [De Lasa et al., 2010, 
Herzog et al., 2015, Kudruss et al., 2015, Posa et al., 2016, Serra et al., 2016, 
Fernbach et al., 2018]. 

These methods need to rely on external contact planners. In some 
applications, where multiple contacts can be established in several regions, 
this approach may be the most viable solution. In this case, search algorithms 
can be used to determine suitable contact locations [Chestnutt et al., 2003, 
2005, Perrin et al., 2011, Stumpf et al., 2014, Karkowski et al., 2016, Griffin 
et al., 2019, Lin et al., 2019]. 

Under the same simplifications, authors of [Feng et al., 2013, Budhiraja 
et al., 2018, Giraud et al., 2020] adopted Differential Dynamic Programming 
to generate whole-body motions, thus solving the optimization problem using 
the Bellman’s principle of optimality presented in Sec. 2.2. These are some 
of the very few applications of indirect methods in humanoid robotics. 


Predefined contact sequence. During locomotion, it can be assumed to 
know in advance the contact sequence. As an example, for a biped robot, it 
can be assumed that a contact with the left foot will be followed by another 
one with the right foot. Recently [Carpentier et al., 2016, Caron and Pham, 
2017, Winkler et al., 2018], authors employ a similar method where contact 
phases are predefined. By specifying a different set of equations depending 
on the contact state, the hybrid nature arising from the establishment of 
contacts is easily modeled. The time spent in each phase can be turned into 
an optimization variable. Nevertheless, in case of several point contacts, the 
definition of the various phases could become intractable. 
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Mixed integer programming. Instead of receiving the contact sequence 
as input, it is possible to use integer variables to determine where a contact 
should be established [Deits and Tedrake, 2014, Mason et al., 2018, Mirjalili 
et al., 2018] and in which instant [Ibanez et al., 2014, Aceituno-Cabezas et al., 
2018]. These approaches require Mixed Integer Programming tools. While 
providing enhanced modeling capabilities, the exploitation of integer variables 
strongly affects the computational performances, especially in case several 
contacts can be established. Furthermore, the availability of specialized 
solvers is limited. 


Complementarity-free contact modeling. The complementarity-free 
approach allows simulating multi-body systems subject to contacts, without 
enforcing complementarity conditions directly [Todorov, 2011, Drumwright 
and Shell, 2010]. Equivalently accurate results are obtained by maximizing 
the rate of energy dissipation. These methods consider contacts explicitly 
inside the planner. Hence, contact location, timing and sequence are decided 
directly by the planner, allowing to generate complex motions [Mordatch 
et al., 2012, Tassa et al., 2012, Erez et al., 2013). 


4.1.2 Simplified model control 


This layer is in charge of finding feasible trajectories for the robot center of 
mass along the walking path. The computational burden to find feasibility 
regions, however, usually calls for simplified models to characterize the robot 
dynamics. Indeed, when restricting the robot center of mass (CoM) on 
a plane at a fixed height and assuming constant angular momentum, it 
is possible to derive simple and effective control laws based on the well 
known Linear Inverted Pendulum (LIP) [Kajita et al., 2001a,b] and the 
Capture Point [Pratt et al., 2006], as also introduced in Sec. 3.6. The 
Divergent Component of Motion (DCM) is a similar concept [Takenaka et al., 
2009]. Initially conceived in 2D , it has been extended in the 3D case too 
[Englsberger et al., 2013]. The Capture Point and the DCM can be adopted 
to draw stability properties of walking motions [Englsberger et al., 2011, 
Koolen et al., 2012, Krause et al., 2012]. These simplified models have a 
widespread diffusion, also amongst torque controlled robots [Stephens and 
Atkeson, 2010a, Pratt et al., 2012, Dafarra et al., 2016, Griffin et al., 2017, 
Englsberger et al., 2018]. Thus, they present high level of robustness, also 
in case of strong perturbations as it is usually the case when controlling a 
humanoid robot in torque mode. 
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The linearity of the model is based on the assumption of constant CoM 
height, resulting in a fixed pendulum constant w. Instead, when considering 
w as a tuning parameter, the model linearity can be preserved even in case of 
varying CoM height [Englsberger et al., 2013]. If it is left unconstrained, it is 
possible to generate walking trajectories with a straight knee configuration 
[Griffin et al., 2018]. In other works, balance is maintained by varying 
only the CoM height [Koolen et al., 2016b], hence completely removing the 
assumption for it to be constant. The robustness properties of this strategy 
have been further studied through the use of Sums-of-Squares optimization 
[Posa et al., 2017]. A similar model has been used also in a multi-contact 
scenario [Perrin et al., 2018] and extended considering zero-moment point 
(ZMP) [Vukobratovié and Borovac, 2004] motions to determine stability 
conditions [Caron and Kheddar, 2017, Caron and Mallein, 2018]. These 
approaches have been used to control a humanoid robot considering only 
single support phases. 

These simplified linear models have allowed on-line model predictive 
control [Wieber, 2006, Diedam et al., 2008, Missura and Behnke, 2014, 
Naveau et al., 2016, Griffin and Leonessa, 2016, Bombile and Billard, 2017, 
Joe and Oh, 2018], also providing references for the footstep locations in 
the form of deltas with respect to desired values. In addition, it is possible 
to derive MPC schemes with the guarantee of producing “stable” CoM 
trajectories using the DCM model [Scianca et al., 2016]. 

Model predictive controllers are appealing also for controlling hybrid 
systems [Bemporad et al., 2002, Lazar et al., 2006] since the full hybrid 
model can be exploited. Indeed, thanks to the prediction capabilities, it is 
possible to include inside the formulation both time- and state-dependent 
switching, performing anticipatory actions for the imminent variation in 
the dynamics. However MPC does not solve those problems related to the 
numerical integration of hybrid systems, which indeed are an open research 
problem [Olejnik et al., 2017, Azhmyakov, 2019, Rijnen et al., 2019]. 

When complete robot models are combined with impact dynamics, the 
control problem increases in complexity. Furthermore, ensuring stability 
and convergence properties of the underlying system requires employing 
control frameworks developed for hybrid systems [Engell et al., 2003]. A 
possibility to ensure these properties is to use the control approach based on 
virtual constraints and hybrid zero dynamics [Grizzle et al., 2001, Westervelt 
et al., 2003, Grizzle et al., 2014, Ames et al., 2014], which has also been 
proved in real experiments performed using humanoid robots with point feet 
[Chevallereau et al., 2003] or passive compliant ankles [Reher et al., 2016]. 
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4.1.3 Whole-body controllers 


The whole-body controllers are instantaneous algorithms that usually find 
(desired) joint torques and contact forces achieving some desired robot accel- 
erations. In this framework, the generated joint-torques and contact forces 
can satisfy inequality constraints, which allow imposing friction constraints. 
The outputs of these controllers shall ensure the tracking of reference posi- 
tions coming from the simplified model control layer. Although the reference 
positions may be stabilized by a joint-position control loop, joint-torque 
based controllers have shown to ensure a degree of compliance, which also 
allows safe interactions with the environment [Ott et al., 2011, Saab et al., 
2013]. From the modeling point of view, full-body floating-base models are 
usually employed in QP controllers. 

These controllers are often composed of several tasks, organised with 
strict or weighted hierarchies. In the former case, null-space projections are 
performed, making sure that a low priority task does not interfere with those 
at higher priority [Park and Khatib, 2006, Wensing and Orin, 2013, Herzog 
et al., 2014, Nava et al., 2016a, Pucci et al., 2016, Koolen et al., 2016a, Padois 
et al., 2017]. Instead, in the latter case, all the tasks are usually included in 
a cost function, regulating the relative priority through a set of weights [Lee 
and Goswami, 2012, Bouyarmane and Kheddar, 2017]. With this approach, 
even the highest priority task is affected by other tasks depending on the 
choice of weights. Hence, this approach may require more tuning. On the 
other hand, there is less chance of task unfeasibility. This is the case when 
high priority tasks require an unfeasible effort from the robot. At the same 
time, even if all tasks are feasible, ill-defined tasks may still cause unwanted 
motions. Hence, in the weighted case, all tasks act as regularization, avoiding 
to pick solutions which are good for one task but inefficient for all the others. 
In [Stephens and Atkeson, 2010b] multiple tasks are achieved by introducing 
fictitious desired wrenches in the controller formulation. 

Interestingly, in [Henze et al., 2014, Koenemann et al., 2015, Neunert et al., 
2018, Giftthaler et al., 2018], an MPC controller is applied for whole-body 
balance and tracking of walking trajectories. 
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4.2 Thesis context 


4.2.1 Part II: 
Predictive Control Based on Simplified Models 


This part presents three applications of optimal control and trajectory 
planning to humanoid locomotion. Here follows the context for each chapter. 


Chapter 5: A Push Recovery Strategy for the iCub Humanoid 
Robot 


We present a momentum-based whole-body torque controller based on a 
MPC formulation. In particular, as in [Caron and Kheddar, 2016, Dai and 
Tedrake, 2016, Ponton et al., 2016] the dynamic evolutions of the robot linear 
and angular momentum are taken into account. We thus make use of a 
reduced model: differently from simplified models used in literature, the robot 
momentum is an exact model that captures the global behaviour of the robot. 
We deal with the complications introduced by the derivative of the angular 
momentum by resorting to a Taylor expansion. Indeed, planners which resort 
to simplified models usually neglect angular momentum, supposing it to be 
constant. Exceptionally, in [Seyde et al., 2018] the angular momentum is 
considered while defining a desired Capture Point trajectory. 

The presented controller allows dealing directly with the intrinsic hy- 
brid dynamics of the system by considering time-varying constraints. The 
computed control inputs, i.e. the contact wrenches, are directly applied to 
the robot, rather than being used to define a joint reference trajectory, thus 
representing a particularity of our approach. 

The controller architecture inherits its structure from the momentum- 
based whole-body torque controller [Nori et al., 2015, Pucci et al., 2016, Nava 
et al., 2016b] implemented on iCub. Torque control is particularly suitable 
for our application given that it permits absorbing the impacts efficiently, 
maintaining the balance also in case of robot positioning errors. We tested 
this approach on the simulated iCub humanoid robot while performing a 
step recovery strategy. To summarize, the application presented in Chapter 5 
is placed amongst the trajectory generators which relies on external contact 
planners. Nevertheless, it interfaces directly to a whole-body controller 
without the need of a simplified model controller. 
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Chapter 6: On-line Predictive Planning for Walking of the iCub 
Robot 


This chapter presents an on-line predictive kinematic planner for foot-step 
positioning and center-of-mass (CoM) trajectories. This planner is also 
integrated with a stack-of-task torque controller, which ensures a degree of 
robot compliance. The entire architecture uses on-line feedback from the 
robot and user-desired quantities. It implements the above three layers, and 
can be launched on both position and torque controlled robots. 

The trajectory optimization for foot-step planning is achieved by a plan- 
ning module that uses a simplified kinematic robot model, namely, the 
unicycle model. Foot positions are updated depending on the robot state 
and on a desired direction coming from a joypad, which gives a human 
user teleoperating the robot the possibility of defining desired walking paths. 
Differently from [Faragasso et al., 2013], we do not fix a desired robot velocity. 
Compared to [Diedam et al., 2008, Missura and Behnke, 2014, Bombile and 
Billard, 2017], we do not assume the robot to be always in single support. As 
a consequence, the robot avoids to step in place continuously if the desired 
robot position does not change, or changes slowly. Another drawback of 
these approaches is that foot rotations are planned separately from linear 
positions, and this drawback is overcome by our approach. 

Once footsteps are defined, a MPC module generates kinematically feasi- 
ble trajectories for the robot center of mass and joint trajectories by using 
the LIP model [Kajita et al., 2003] and whole-body inverse kinematics. 

The CoM and foot trajectories are then streamed as desired values to 
either a joint-position control loop, or to a whole-body quadratic-programming 
(QP) controller. This latter controller generates desired joint torques, ensur- 
ing a degree of robustness and robot compliance. The desired joint torques 
are then stabilized by a low-level joint torque controller. Experimental val- 
idations of the proposed approach are carried out on the iCub humanoid 
robot, with both position and joint torque control experiments, highlighting 
the differences between these two modes. 


Chapter 7: Optimal Control of Large Step-Ups for the Atlas Robot 


In this chapter we leverage trajectory optimization techniques to generate 
motions which allow a humanoid robot to perform large step-ups. In particu- 
lar, we exploit a reduced model of the centroidal dynamics, where the forces 
on both feet are considered. The angular momentum is assumed constant, 
thus extending [Koolen et al., 2016b]. By adopting a phase-based trajectory 
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optimization technique [Winkler et al., 2018], we plan over a fixed series of 
contacts. Through a particular heuristic we avoid generating trajectories 
which would require a high torque expenditure. Experiments are performed 
on the Atlas humanoid robot, showing a reduction of the maximal knee 
torque up to 30% in simulation and up to 20% on the real hardware. 


4.2.2 Part III: 
Predictive Control Based on Complete Models 


In this part, we merge the first two layers, generating locomotion trajectories 
adopting the full kinematics of the robot and the centroidal dynamics. 


Chapter 8: Modeling of a Humanoid with Complementarity Con- 
ditions 


In this chapter we model the robot in contact with the environment similarly 
to [Dai et al., 2014]. In particular, we use the full kinematics of the robot 
and the centroidal dynamics but, in addition, we introduce new methods to 
define the complementarity constraints, i.e. those which link the velocity of 
a contact point and the force applied to it. In addition, we present a novel 
method to constrain their planar motion. This method, similarly to [Todorov, 
2011, Drumwright and Shell, 2010, Dai et al., 2014], does not consider a 
predefined contact sequence and does not need integer variables. 

Finally, the robot in contact with environment is modeled as a single, 
non-linear, dynamical system subject to equality and inequality constraints. 
This allows applying the optimal control techniques presented in Chapter 2. 


Chapter 9: The Whole-Body Non-Linear Predictive Controller 


We present here the set of tasks and constraints which can generate walking 
motions. Compared to [Dai et al., 2014] we do not assume to have a joint 
trajectory to be followed. On the contrary, such reference is fixed, while 
walking motions are generated by defining a constant desired CoM velocity 
and a moving reference on the ground. Given the high dimensionality of the 
system under control, several other tasks are included to “regularize” the 
generated walking motion. 

This chapter complements Chap. 8 in defining an optimal control problem. 
In particular, this chapter presents those tasks and constraints which are 
specific for the walking motions, thus not related to physical principles. 
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Chapter 10: Validation and Experimental Results 


In this chapter we validate the model and the tasks presented in the previous 
chapters. In particular, we first show an example of generated trajectory. 
Secondly, we evaluate the performances of the planner when changing the 
method for defining the complementarity constraints. 

Finally, similarly to [Carpentier et al., 2016], we generate off-line a walking 
trajectory including the desired joints value using the method described in 
the previous chapters. These are first interpolated and then fed to the robot 
by means of a joint position controller. 
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Part II 


Predictive Control Based on 
Simplified Models 


Keep on the lookout for novel ideas that others have used successfully. 
Your idea has to be original only in its adaptation to the problem you’re 
working on. 


Thomas Edison 
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A Push Recovery Strategy for the 
iCub Humanoid Robot 


Balancing and reacting to strong and unexpected pushes is a critical require- 
ment for humanoid robots. In this chapter, we use a predictive scheme for 
step recovery connected to a momentum-based whole-body torque controller, 
presented in Sec. 5.1. We use prediction to consider the switching of contact 
constraints induced by the step. Using this strategy, the robot is able to stand 
and maintain the upright position even after pushes of intensity comparable 
to a third of the robot weight. Sec. 5.2 presents the modeling tools used 
to achieve such a goal, while Sec. 5.3 defines the corresponding optimal 
control problem. The definition of references is introduced in Sec. 5.4. The 
prediction capabilities of the controller are also exploited to determine when 
the robot is about to fall and it is necessary to perform a step. This concept is 
presented in Sec. 5.5. Experiments in simulation, shown in Sec. 5.6, validate 
the proposed approach. Sec. 5.7 presents concluding remarks introducing 
pros and cons of the approach. 


5.1 Background on momentum-based whole-body 
torque control 

The push recovery strategy proposed in this chapter interfaces with the 

momentum-based whole-body torque control implemented for the iCub 


humanoid robot. In this section, we briefly outline it and we refer the reader 
to [Nori et al., 2015, Nava et al., 2016b] for additional details. 
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The momentum-based balancing controller is a hierarchical controller 
composed of two control objectives. The first, and highest priority objective, 
is the tracking of a desired centroidal momentum while the second is the 
stabilization of the zero dynamics. By assuming as virtual control inputs 


the contact wrenches f = [;f',--- ,,,£']', it is possible to control the robot 
momentum by solving the following minimization problem: 
. 2 
minimize leh — gh* 
subject to: (5.1) 
Gh := @X f+mg, 
Af<b, 


where the matrix @X represents the column concatenation of all the trans- 
formation matrices 6X k of Eq. (3.47). The inequality constraint Af < b 
represents friction cone, center of pressure and other constraints on the 
wrenches. The desired momentum rate of change is obtained by mean of a 
PI control law plus a feed-forward action [Nava et al., 2016b]. In particular, 
it allows tracking a desired linear CoM position and velocity. The angular 
momentum is controlled to zero, including some custom-made integral terms 
to avoid instability of the zero-dynamics. 

The second objective is responsible for constraining the joint variables and 
avoid internal divergent behaviors. As before, we can specify a minimization 
problem also for this second task, i.e. 


minimize ||7 — ||? (5.2a) 
subject to: M(q)v + C(q,v)v + G(q) — Je f= bea Ts, (5.2b) 


Jet + Jev = 0, (5.2c) 
2 
= solution of (5.1). (5.2d) 


Eq. (5.2b) corresponds to the free-floating dynamics of the mechanical 
system described in Sec. 3.4. Eq. (5.2c) is the constraint equation describing 
the kinematic constraints associated with the contacts. yw represents a joint 
torque reference computed as in a PD plus gravity and contact wrenches 
compensation controller. More details on this reference can be found in 
[Nava et al., 2016b]. Eq. (5.2d) is the hierarchical constraint, i.e. it prevents 
the solution of this second problem from changing the optimum of Eq. (5.1). 

Equations (5.2b) and (5.2c) together describes the dynamics of the con- 
strained dynamical system, as described in Sec. 3.4. Summarizing, from a 
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functional point of view, the momentum-based controller takes as reference 
a desired momentum trajectory, a desired joints configuration and the set 
of contact constraints. The generated torques are then applied directly as 
references for the low-level torque control. 

It is worth noting that when the constraint set changes, e.g. when the 
robot goes from two feet to one foot or vice versa, the constrained dynamics 
changes. The overall system is thus a hybrid system and the discrete state 
transitions should be handled accordingly. Indeed, the strategy presented in 
this chapter deals with this problem. 


5.2 Modeling for push recovery 


Let us consider the case where no contacts are available except the two 
applied at the feet. Then, we can rewrite Eq. 3.47 as: 


mils 0 ‘ T 
-| AR [ax @X"| ¢| +9 (5.3) 


CoM 
03x3 13 


ah” 


where the indexes / and r are relative to the left and right foot respectively. 

In order to model the step in a model predictive framework, we can 
assume to know the instant in which the swing foot touches the ground. 
In fact, the considered model does not contain any information about the 
posture of the robot and therefore it is not possible to define a “transition 
function”. For example, the distance of the foot from the ground can be 
used to predict when the step is going to take place, but this would require 
injecting also kinematic information in the planner. In other words, the 
controller is aware that the impact takes place in a precise instant in the 
future, but it does not know whether the quantities involved in the model 
affects this instant or not. Then, the most viable choice is to consider this 
instant constant and known in advance, named timpact and equal to the time 
needed to perform a step. This quantity is set by the user, also depending 
on the dynamic capabilities of the robot. 

For the same reason, the position that the swing foot takes after the step, 
is assumed to be known and constant too. This takes particular importance 
since @X* (where s refers to the swing foot) directly depends on this position. 
In Sec. 5.4 we describe how this position is obtained. Summarizing, the 
characteristics of the step, i.e. the duration and the target position, are 
assumed to be known and constant. 
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5.2.1 The angular momentum 


We focus now on the matrix @X°* (here i is a placeholder for subscript l or 
r), whose structure depends on the choice of the frame to describe the robot 
momentum. The matrix has the following form: 


13 03x3 
-Xi = À 5.4 
G le —2com) 13 | (5:4) 


with ;p € R? being the foot position. In fact, we assume contact wrenches 
to be measured in a frame located in ;p and oriented as the inertial frame 
T, thus parallel to G. Hence, the time derivative of the angular momentum 


corresponds to: 
lyr 


gh” = Sip — £com)* if +T. (5.5) 
(3 

The product between £com and ;f makes Eq. (5.5) bilinear (hence non- 
convex) with respect to these two variables. In the literature this problem 
is addressed by minimizing an upper-bound of the angular momentum [Dai 
and Tedrake, 2016], or by approximating it through quadratic constraints 
[Ponton et al., 2016]. In the application presented in this chapter, angular 
momentum is mainly needed to bound the usage of contact wrenches, rather 
than being precisely controlled to zero. Thus, we can accept a more coarse 
approximation, relying on its Taylor expansion. It is computed around 
the values obtained from the latest available feedback, indicated with the 
superscript 9. Such approximation has the following expression: 


{Lr} 


a oh dah” 
gh ah EDD G 


Oif 


dah” 

G ( 0 
ZLCoM — & ) 

9 Co o CoM } > 


(if - f°) F 


{Lr} 
gh? ~ > iT + (p- a on (ip - a, (if = if’) A 


i 


+ Can (zcom = went) 


where we exploited the anticommutative property of the cross product, i.e. 
x^y = —y^x. By simplifying, we finally obtain 


ah” x oe + (i — aha) if + (.#°)” (2com = zlom) - (5.6) 
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The approximation introduced with the truncation to the first order 
is o ((iF — if”) (com = sbam) )- In fact, there are no other quadratic or 


higher order terms. 


5.3 Optimal control problem definition 


First, let us define the state variables x and the control input variable f: 


TCoM f 
x:= |čcom|, f= PE x ER’, feR?”. (5.7) 


-hw di 
fel 


We can rewrite Eq. (5.3) using (5.6), obtaining: 


X= E.x+ f+ Q, + K°, (5.8) 
g | 03x3 13 0x3] 
E; = 03x3 P 03x3 03x31 , 
| (f° + f°) 03x3 03x3 
g 03x3 03x3 03x3 03x3 
F, = mts 03x3 mts 03x3] , 


(ip = ce 13 (rp — Powe 13 


x | 03 ae 06 


Here, K? introduces the constant terms resulting from the Taylor approxi- 
mation. Thus, it is dependent from the latest available feedback coming from 
the robot. Under the above considerations, the model described by Eq. (5.8) 
is affine and can be easily inserted into a QP formulation, as described with 
more details in Sec. 5.3.3. 


5.3.1 Contact constraints 


The constraints to be enforced are mainly related to the feasibility of the 
exerted wrenches. Furthermore the wrench acting on the swing foot should 
be null for every instant before the impact, i.e. timpact- 

We start by examining the constraints acting on the stance foot (defined 
with the subscript st): 


Ast stf < stb Vt: t < T, (5.9) 
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with T being the prediction horizon. Eq. (5.9) encompasses all the considered 
inequalities constraints, namely: i) friction cone with linear approximation, 
ii) center of pressure, iii) positivity of the normal contact force. 

The constraints on the swing foot instead, defined by the subscript sw, 
introduce the hybrid nature of the system. In particular, the wrench must 
be null before the impact and feasible after it. Formally: 


(5.10) 


swf = 06x1 Vt: t< timpact 
Asw swf < swb vt: timpact <t< T 


where Asw and swb describe the same feasibility constraints as A,; and stb. 
The above equation assumes that timpact < T. If the impact does not occur 
inside the control horizon, then the wrench exerted by the right foot is forced 
to be null throughout the whole horizon. 

We also included an additional constraint to enforce that balancing is 
kept after the step. In particular, after this instant, we can constrain the 
Capture Point xcp (introduced in Sec. 3.6.2) to be inside the convex hull of 
the two feet, which can be predicted by knowing the future position of the 
swing foot. Thus, we can define a set of linear inequalities such that if 

AchECP < chb, Vt: t > timpact, (5.11) 


is satisfied, than the Capture Point is inside the convex hull. Hence, we force 
it to be in a capturable state after the step is performed [Pratt et al., 2006, 
Koolen et al., 2012]. In fact, the convex hull represents the region where it is 
possible to stabilize the Capture Point dynamics without additional steps. 


5.3.2 Cost function definition 


We define here the cost function applied within the MPC controller, defined 
by J. Note that different terms of the cost function act only after the step 
is taken. It has the following expression: 


T T 
f=; (/ IxT) -xx Ollik ar+ f EGPA dr+ (5.12a) 
T 
+ / |x (7) — X" (7) |li dr+ (5.12b) 


imp 


H= X (T) Il cine ) (5.12c) 
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Since the impact may occur after the end of the prediction horizon, timp is 
the minimum between timpact and T. This allows us to weight the state x 
in a different way before (A) and after (KŻ?) the step. For the sake of 
simplicity, the initial time instant is set to zero. 

Finally, a terminal cost term is inserted (using the matrix Kerr intro- 
duced before) to model the finiteness of the control horizon. This is also 
useful in case timp = T. 

Consider now the final objective of having the center of mass over the 
centroid of the support polygon. We decide to weight only the normal 
component of the CoM error throughout the whole horizon, while weighting 
the transverse components (i.e. « and y) only in the weight matrix Ky”. 
Hence, they are considered only in the terminal cost, Eq.(5.12c), and after 
the step is made, i.e in the cost of Eq.(5.12b). 

Finally, the requested reaction forces are mildly weighted too. This term 
acts as a regularization, to prefer solutions requiring less stress to the robot. 

The outputs of the MPC strategy are used as references for the whole- 
body torque controller described in Section 5.1. In particular, we remove Eq. 
(5.2d) since the MPC takes care of computing feasible wrenches f. These 
are directly included in Eq. (5.2b). Desired joint positions are obtained by 
means of an inverse kinematics problem, i.e. we impose a desired pose for 
the swing foot, as defined in Sec. 5.4, following as close as possible the center 
of mass trajectory defined by the MPC controller. 


5.3.3 Quadratic programming transcription 


We solve the finite-horizon optimal control problem of Sec. 5.3 with the 
direct multiple shooting method presented in Sec. 2.3.2, i.e. we consider all 
the intermediate state variables. It is converted into a QP problem to be 
solved at each time step. 

We first discretize the model using a forward Euler scheme. Different 
approaches may have been chosen, as described in Sec. 2.3.1, but since we 
already adopted a first order Taylor approximation in the dynamics (see Sec. 
5.2.1), it would be inconvenient to use higher order methods. In addition, we 
use a fixed integration step dt of 10 milliseconds, which is small enough (given 
the system into consideration) to avoid numerical instability. By discretizing 
Eq.(5.8), we obtain: 


x(k +1) = Exx(k) + Fef(k) + G, + K9, (5.13) 
where E, = ly + dtE,, F, = dtF,, G, = dtG,, K? = dtK?, while k € N, 
k=0,:--, N — 1 is the discrete time, N = [ty/dt]. 
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According to the receding horizon principle, as presented in Sec. 2.4, at 
each time step, f(0) is applied to the system through a modified version of 
the balancing controller implemented on iCub, as anticipated in Sec. 5.3.2. 

The definition of the time instant at which the impact occurs is a crucial 
point to be considered. We assume the impact to be impulsive and taking 
place at the beginning of a time step. We denote this instant with kimpact. 
We further assume that the control input f(k) is applied piecewise constantly, 
i.e. constant from instant k tok+1. Note that, in view of the above two 
assumptions, setting Kimpact = 0 implies that both feet are in contact already 
at the initial time k = 0. 

As mentioned previously, we do not directly model the distance between 
the swing foot and the ground, but we compute the position and the expected 
impact time at the beginning of the push strategy. The following procedure 
describes how we update the expected impact instance throughout the 
proposed MPC controller. Assuming the initial time instant to be always 
set to zero, we start with a value of kimpact equal to [timpact/dt]. At each 
controller execution, we decrease it by one, i.e. kimpact = Kimpact — 1. If the 
impact has not occurred yet, we saturate kimpact to 1, i.e. we expect the 
impact to occur at the second time step. This avoid requiring wrenches on 
a swinging limb. If we measure a certain amount of force applied at the 
swing leg, we assume the impact to have occurred. In this case, kimpact iS 
set constant to 0. 


5.4 Definition of the swing foot reference position 


The output of the optimal control problem presented in Sec. 5.3 does not 
define the target position the swing foot should achieve. Nevertheless, such 
controller strongly depends on it. In this section, we present a simple heuristic 
we adopted to determine where the robot has to step [Dafarra et al., 2016]. 

One key characteristic is that the foot position is not planned a priori, but 
it depends on the status of the robot when the push-recovery controller starts. 
This allows the robot to take different steps depending on the direction and 
intensity of the push. 

The heuristic we use is based on the Capture Point concept described 
in Sec. 3.6.2. The solution to the differential equation of Eq. (3.68), with 
initial conditions in t = 0 is given by the following: 


acp(t) = e* (acp(0) — px) + pa, (5.14) 
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where pæ is the origin of the frame attached to the stance foot. This equation 
describes the time evolution of the Capture Point given its initial condition 
xop and the (constant) stance foot position x94. Knowing beforehand the 
time to perform the step tstep, the desired swing foot position swx* € R? is: 


evtstep (xcp(0) = pz) + p£ 


: (5.15) 


* 
swt = 


This quantity is kept constant for the whole prediction horizon. The orienta- 
tion of the foot is defined by the user instead. In other words, we land the 
swing foot on the Capture Point position we predict at end of the step. 

From the definition of the desired swing foot pose, it is also possible to 
compute a CoM position reference. In particular, for what concerns the x 
and y components, we choose it as the center of the convex hull given by the 
two feet. The desired height is kept equal to the initial one. The controller 
follows this reference only after the step is performed. During the swing 
instead, the controller does not follow any reference CoM trajectory, but it 
reaches this point thanks to its prediction capabilities. 


5.5 MPC as a step trigger 


Let us consider the scenario where the robot has to perform a step because of 
an external perturbation. In principle we could leverage the presented MPC 
formulation to define the moment in which to start the motion, up to now 
considered as a datum. This moment can be defined as the instant where the 
controller is not able to bring the CoM back to a desired equilibrium point, 
given the present support configuration. In other words, the constraints 
related to the balancing configuration prevent the controller to recover from 
the push. Thus, it is necessary to step and change balancing configuration 
in order to avoid a fall. The availability of a prediction horizon particularly 
suits this idea. Hypothetically, if T = oo, as soon as the robot is pushed, 
we could predict whether or not the controller will be able to absorb the 
disturbance completely. In practice this is not true. The finiteness of the 
prediction horizon hides the actual recovery capabilities of the controller, 
thus it is still necessary to define a heuristic. By considering the very last 
predicted state, we can set the following condition: 


|@com (T) — Bo ou|| + ku ||@com(T) — téoml| < d. (5.16) 
When Eq.(5.16) is violated, the robot performs the step. Both k, and d € R 


are user-defined parameters affecting the sensitivity to pushes. The smaller d 
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b) t= to + 1s c) t = to + 2s d) t = to + 2.5s 


Fig. 5.1 ier of the step recovery motion when the robot is pushed at 
45° with respect to the lateral axis. 


the more the robot will be inclined to take a step, while k, allows to regulate 
the relative importance of the two errors. Notice that Eq.(5.16) does not 
depend directly on the foot dimensions. Nonetheless, the heuristic depends 
on this quantity implicitly. A smaller foot would limit the capabilities of the 
MPC controller to steer the state towards its reference. 

In order to be effective, this heuristic needs the MPC strategy to be 
in charge of sending references to the robot, even in case there are no 
disturbances. However, the goal is different from what has been presented in 
Sec. 5.3.2. The robot should do its best to avoid a step. As a consequence, 
timp Will be set equal to T (i.e. the step will not occur at all), while Ky 
should be equal to KY”, to maintain the robot in its current position. 


5.6 Validation and simulation results 


The presented MPC approach has been tested in the Gazebo simulator 
[Koenig and Howard, 2004] by using the iCub model. The iCub humanoid 
robot, presented in Sec. 1.1, possesses 53 actuated joints, but only a total of 
23 degrees of freedom (DoF) are used for balancing tasks. In fact, we do not 
consider those located in the eyes and in the hands. 

Driven by the need of fast prototyping, the presented controller has been 
developed using the MATLAB® / Simulink® environment. For the presented 
tests, we use a machine running Ubuntu 16.04. The PC is equipped with a 
quad-core Intel® Core i5@2.30GHz and 16GB of RAM. MOSEK® is the 


‘https: //www. youtube .com/playlist?list=PLBOchT-u69w9hJ6BmqPf06r0zWmungOrc 
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Fig. 5.2 CoM evolution during three different experiments of push recovery. 
The single dashed lines in (a) and (c) show a not-enough-strong push to 
violate the condition in Eq. (5.16). The other two vertical dashed lines 
indicate the beginning and the end of the step (i.e. when the swing foot hits 
the ground). The external push force has been applied, with respect to the 
lateral axis, at 20°, —20° and 45° for the cases (a), (b) and (c) respectively. 
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(c) Forward step 


Fig. 5.2 (cont.) CoM evolution during three different experiments of push 
recovery. The single dashed lines in (a) and (c) show a not-enough-strong 
push to violate the condition in Eq. (5.16). The other two vertical dashed 
lines indicate the beginning and the end of the step (i.e. when the swing foot 
hits the ground). The external push force has been applied, with respect to 
the lateral axis, at 20°, —20° and 45° for the cases (a), (b) and (c) respectively. 
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Fig. 5.3 Tracking of the desired right foot position provided by the momentum 
controller introduced in Sec. 5.1. The vertical dotted lines have the same 
meaning of Fig. 5.2, namely they represent the beginning and the end of 
the step. The reference position is computed with the method presented in 
Sec. 5.4. It is computed when the step starts and it is tracked only after this 
moment. The tracking is satisfactory in (b), while it presents a large error 
on the «—direction in (a). In (c) there are non-negligible errors for both the 
x— and y— direction, probably because of a too far reference, that is hard 
to track during such dynamic motion. 
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Fig. 5.3 (cont.) Tracking of the desired right foot position provided by the 
momentum controller introduced in Sec. 5.1. The vertical dotted lines have 
the same meaning of Fig. 5.2, namely they represent the beginning and 
the end of the step. The reference position is computed with the method 
presented in Sec. 5.4. It is computed when the step starts and it is tracked 
only after this moment. The tracking is satisfactory in (b), while it presents 
a large error on the x—direction in (a). In (c) there are non-negligible errors 
for both the x— and y— direction, probably because of a too far reference, 
that is hard to track during such dynamic motion. 
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Fig. 5.4 (a) Vertical components of the contact forces during the side-step 
experiment. Measured forces are plotted with dashed lines, while desired 
forces with solid lines. (b) Position tracking error of the right knee joint. 
The knee absorbs the impact with the ground, as it can be noticed by the 
peak in the error right after the impact. In both the plots, the first vertical 
dashed line shows a not-enough-strong push to violate the condition in Eq. 
(5.16). The other two vertical dashed lines indicate the beginning and the 
end of the step. 
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selected solver, accessed through the MATLAB® interface CVX [CVX Re- 
search, 2012]. In order to test the presented MPC scheme, we set a simple 
stepping scenario, where the robot, balancing on its left foot, uses the right 
foot to take a step. This simple scenario allows us to test the performance 
of the proposed controller with a single contact activation. 

We present the results of different pushes, applied on the traverse plane, 
with an angle with respect to the lateral axis (pointing to the right of the 
robot), of 20° (Fig. 5.2a), —20° (Fig. 5.2b) and 45° (Fig. 5.2c). 

We choose a time step of 10ms, coincident with the rate of the whole-body 
torque controller, and a controller horizon of N = 25, totaling 250ms. We 
noticed that the chosen value of N is sufficient to allow the effectiveness of 
the strategy when the robot is pushed from different directions. The push 
is nearly impulsive, applied on the chest with a magnitude of around 100N, 
which is about one third of the robot weight force. Fig. 5.1 presents a series 
of snapshots showing the robot while performing a large step. 

Figure 5.2 shows the CoM evolution for the three experiments, i.e. with 
different directions of pushes. It can be noticed in both Figures 5.2a and 
5.2c that two pushes occur. The first one does not violate the condition 
in Eq. (5.16), thus it does not force the robot to take a step. The second 
one, instead, triggers a change in the support foot configurations, and as 
a consequence, a new desired configuration for the CoM. The CoM height 
does not reach its initial value due to the kinematic limitations of the robot. 
Indeed, as it can be noticed from Figure 5.1, legs are nearly fully extended. 

The desired right foot positions for each experiment are depicted in 
Fig. 5.3, together with the corresponding estimated planar coordinates. As 
introduced in Sec. 5.3.2, the desired right foot position is turned into a joint 
reference by means of inverse kinematics. Such reference is given as postural 
task to the momentum controller introduce in Sec. 5.1, which is considered 
a low priority objective. This is one of the main reasons behind the poor 
tracking performances. On the other hand, this also highlights the robustness 
properties of the presented architecture, which is able to maintain balance 
also in case of strong disturbances in the swing foot placement. Nevertheless, 
after the step is completed, the MPC strategy is fed with the measured foot 
position to be consistent with the new contact configuration. When the 
robot is pushed at an angle of 45°, the reference is very far for the robot 
to be achieved. As it can be seen in Fig. 5.3c, the foot does not reach its 
reference on the x-direction. Nevertheless, the controller is able to land the 
foot in a position which still allows it to maintain the balance. 

Figure 5.4a represents the tracking of the desired vertical forces output by 
the MPC controller for the side-push experiment. Remarkably, the normal 
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force on the right foot appears to be tracked also across the step. Figure 5.4b 
shows one of the benefits of torque control. The tracking of the joint position 
reference on the right knee undergoes a strong perturbation after the step. 
When hitting the ground, the intrinsic compliance introduced by torque 
control allows to absorb the impact, especially on this joint. This induce a 
peak of 30° of tracking error but the robot is still able to balance. In addition, 
torque control helps avoiding problems related to a not perfect placement 
of the swing foot before the impact. The compliance at the ankles helps 
absorbing the disturbances induced by an impact with a foot not perfectly 
parallel to the ground. 

Summarizing, the presented controller allows the robot to recover from 
pushes of various intensity and directions, while remaining able to perform 
involved step movements. 


5.7 Conclusions 


The proposed controller adopts an approximated model of the robot linear 
and angular momentum dynamics in a predictive framework. It allows taking 
into account step movements by varying the structure of constraints and cost 
functions across the change of contact configuration. The uncertainty on its 
actual time instant is considered by a “shift” in the prediction window. A 
heuristic is also employed as a condition for stepping. The contact wrenches 
are assumed to be control inputs and realized through a modified version of 
the iCub momentum-based whole-body torque controller. 

This approach avoids the definition of a CoM trajectory along the step, 
leaving such responsibility to the optimizer rather than to the designer. 
Currently, it takes almost 0.1 seconds to solve an instance of the presented 
formulation. Unfortunately, this prevented the application on the real robot. 
In the following chapter, this problem is solved by using more simplified 
models. Nevertheless, the interest on developing a planner able to consider 
both the dynamics and the kinematics of the robot has not vanished, leading 
to the results of Part III. 

In particular, in the following, we focus on the generation and control of 
walking trajectories. Hence, the foot position tracking is fundamental and 
it requires the development of a new whole-body controller. Results have 
also shown the importance of generating references which can be kinetically 
achieved by the robot. This topic is analyzed in more details in Part III. 
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On-line Predictive Planning for 


Walking of the iCub Robot 


We present here an architecture composed of three nested control loops. Sec. 
6.1 introduces the background concepts exploited in the control architecture 
presented in Sec. 6.2. The outer loop exploits a robot kinematic model to 
plan the footstep positions. In the mid layer, a predictive controller generates 
a CoM trajectory according to the well-known table-cart model. Through 
a whole-body inverse kinematics algorithm, we can define joint references 
for position controlled walking. The outcomes of these two loops are then 
interpreted as inputs of a stack-of-task QP-based torque controller, which 
represents the inner loop of the presented control architecture. This resulting 
architecture allows the robot to walk also in torque control, guaranteeing 
higher level of compliance. Real world experiments have been carried out on 
the humanoid robot iCub and they are shown in Sec. 6.3. Finally, Sec. 6.4 
concludes the chapter. 


6.1 Background 


For a certain number of applications, the terrain can be considered flat. In 
these cases, it is known that the human upper body is usually kept tangent 
to the walking path |Flavigne et al., 2010, Mombaur et al., 2010] all the more 
so because stepping aside, i.e. perpendicular to the path, is energetically 
inefficient [Handford and Srinivasan, 2014]. All these considerations suggest 
using a simple kinematic model to generate the walking trajectories: the 
unicycle model (see, e.g., [Morin and Samson, 2008]). This model can 
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O L, 
Fig. 6.1 Notation. The unicycle model is a planar model of a robot having 
two wheels placed at a distance 2my, my € R with a coinciding rotation 
axis. Hence, this mobile robot cannot move sideways, i.e. along the wheel 
axis, but it can turn by moving the wheel at different velocity. B is a frame 
attached to the robot whose origin is located in the middle of the wheels 
axis. Point F is attached to the robot. Its position expressed in B is given 
by d € R?, a constant vector. 


be used to plan footsteps in a corridor with turns and junctions using 
cameras |{Faragasso et al., 2013], or to perform evasive robot motions [Cognetti 
et al., 2016]. In all these cases, however, the unicycle velocity is kept to a 
constant value. 


6.1.1 Background on the unicycle model 


The unicycle model, represented in Fig. 6.1 is described by the following 
model equations [Pucci et al., 2013]: 


ty = vy Ro(Ou)er, (6.1a) 
ĝu = WŲ, (6.1b) 


with vy € R and wy €E R the robot’s rolling and rotational speed, respectively. 
zy € R? is the unicycle position in the xy plane of the inertial frame Z. 
Oy € R represents the angle around the z—axis of Z which aligns the inertial 
reference frame with a unicycle fixed frame. The variables vy and wy are 
considered as kinematic control inputs. 
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A reasonable control objective for this kind of model is to asymptotically 
stabilize the point F about a desired point F* whose position is defined as 
£p. For this purpose, we define the error £ as 


£:i= £F — TẸ, (6.2) 


so that the control objective is equivalent to the asymptotic stabilization of 
x to zero. Since we define £p as 


xr = ty + Ro(du)d, 
then by differentiation it yields the dynamics 
£F = £y + wy Ra (0y) Sod. (6.3) 


Eq. (6.3) describes the output dynamics. Substituting Eq. (6.1) into Eq. 
(6.3), we can rewrite the output dynamics as: 


TF = By(by)cu = [Rover Re(y) Sod CU (6.4a) 
= 1 —d2 
= R2(0y) f d CU (6.4b) 
where cy = [vy wy]! is the vector of control inputs. It can be noticed 


that det [Bu (6v)| = dı, which means that when the control point F is not 
located on the wheels’ axis, its stabilization to an arbitrary reference position 
x7, can be achieved by the use of simple feedback laws. For example, if we 
define the following control law 


cu = Bu (Ou) (tp — Kué) (6.5) 
with Ky a positive definite matrix, then we have our error dynamics 
x= —Kéz. (6.6) 
Thus, the origin of the error dynamics is an asymptotically stable equilibrium. 
Notice that this control law is not defined when dı = 0. 
6.1.2 Background on ZMP preview control 


The MPC controller implemented in this work has been derived from the 
Zero Moment Point (ZMP) preview control described in [Kajita et al., 2003], 
here introduced. This algorithm adopts a simplified model, i.e the Linear 
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Inverted Pendulum (LIP) model, introduced in Sec. 3.6.1. In particular, 
we adopt a slightly modified version, where the pæ is to the ZMP position 
[Vukobratovié and Borovac, 2004]. Nevertheless, the dynamic equations 
remain the same. 

The ZMP can be related to the CoM projection, called æzzp as in Sec. 
3.6.1, through the following equation: 


LIMP = LLIP — W ËLIP- (6.7) 


Assuming we have a desired ZMP trajectory 24,;p(t), we want to track 
this signal at every time instant. One possibility is to consider the ZMP as 
an output of the following dynamical system: 


X = Azur X + Bzmp u, 


(6.8) 
y = CMP X 
where the new state variable x and control u are defined as 
jeur] 
X:= |azprpl € R°, u := Ea (< R?. (6.9) 
[ër] 
The system matrices are defined as in the following: 
[0> 12 One Osx 
Azmp = |02x2 02x2 l2 |, Bzmp= Lo 
[02x2 02x2 02x2] (6.10) 
CZMP = [12 02x2 —w?12] ; 
We define a cost function J as follows: 
T j P 
ga |æžue — Czue x ep +lulh ar- (6.11) 


J penalizes the output tracking error plus a regularization term on the effort. 
The system described by Eq. (6.8) is linear, while the cost J is quadratic. 
This formulation provides the basis for the constrained MPC controller we 
present in Sec. 6.2.2. 
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Fig. 6.2 Representation of the sampling algorithm. The unicycle trajectory 
is initially discretized to obtain æy (kdt) (red dots), then the foot poses are 
sampled, obtaining ;p and p. 


6.2 Control architecture 


In this section we summarise the components constituting the presented 
architecture, namely: 


e the footstep planner, 
e the MPC controller, 
e the stack-of-tasks balancing controller. 


These components share a lot of commonalities with other state of the art 
approaches. Nevertheless, this section presents how these three components 
interconnect to define a walking architecture. In particular, we focus on all 
the aspects necessary for a simple walking task, starting from the user inputs, 
up to the references to the low-level robot controller, while dealing with its 
balance. Such architecture defines the main contribution of this chapter. 


6.2.1 The footstep planner 


Consider the unicycle model presented in Section 6.1.1. Assuming we know 
the reference trajectory for point F up to time T, we can integrate the closed- 
loop system described in Section 6.1.1 to obtain the trajectory spanned by 
the unicycle. The next step consists in discretizing the unicycle trajectories 
at a fixed rate 1/dt. This passage allows us to search for the best foot 
placement in a smaller space, constituted by the set of points obtained by 
discretization from the original unicycle trajectories. 
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To determine the foot placements, we assume them to be coincident with 
the corresponding unicycle’s wheel, as shown in Fig. 6.2. Hence, the desired 
planar position of the left and right foot!, ;p and „p, can be obtained as: 


ip = zy + Ro(9y) i ; (6.12a) 


rp = ty + Ro(Ou) E , (6.12b) 


where my is the distance of a wheel from the center of the unicycle, see Fig. 
6.1. The foot orientations in the xy plane, ;0 and +0, coincide with Oy. 

A step contains two phases: double support and single support. During 
double support, both robot feet are on the ground and, depending on the 
foot, we can distinguish two different states: switch-in if the foot is being 
loaded, switch-out otherwise. Instead, in single support, a foot is in a swing 
state if it is moving, stance otherwise. The instant in which a foot lands on 
the ground is called impact time, a (f is a placeholder for either l or 
r). After this event, the foot will experience the following sequence of states: 
switch-in — stance + switch-out + swing. This sequence is terminated by 
another impact time. At the beginning of the switch-out phase, the other foot 
has landed on the ground with an impact time a The step duration, 
A; is then defined as: 

A= t 


impact impact’ 


We define additional quantities relating the two feet when in double support 
(this phase is indicated with the ds subscript): 


e The orientation difference Ag = ||194s — -9as|\; 
e The feet distance Ap = ||1pas — rpas||; 


e The position of the left foot expressed on the frame rigidly attached to 
the right foot, "o;. 


These quantities will be used to determine the foot trajectories starting from 
the unicycle ones. 

Given the discretized unicycle trajectories, a possible policy to define 
footsteps consists in fixing the duration of a step Aj, or in fixing its length, 
Ap. Both these two strategies are not desirable. In the former case the robot 


‘With an abuse of notation, here we assume |p, rp € R? since we are just interested in 
the planar position. 
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take always steps which may be too short at maximum speed. In the latter, 
if the unicycle is advancing slowly (because of slow moving references), the 
robot will take steps always at maximum length sacrificing the walking speed. 
In view of these considerations, we would like the planner to modify step 
length and speed depending on the reference trajectory. To avoid fixing any 
variable, it is necessary to define a cost function. Our choice is composed of 
two parts. The first part weights the squared inverse of A;, thus penalizing 
fast steps, while the second penalize the squared 2-norm of Ap, avoiding to 
take long steps. Thus, the cost function J can be written as: 


1 
I = kiz + hello (6.13) 
where k; and kp are two positive numbers. Depending on their ratio, the 
robot will take either long-and-slow or short-and-fast steps. Notice that J is 
not defined when A; = 0, but the robot would not be able to take steps so 


fast. Thus, we need to bound A;: 
tmin < At < tmar, (6.14) 


where the upper bound avoids a step to be too slow. 

A, needs to be lower than an upper-bound dmar, bounding the swinging 
foot into a circular area drawn around the stance foot. Another constraint 
to be considered is the relative rotation of the two feet. In particular the 
absolute value of Ag must be lower than O0maz- 

Finally, to avoid the robot twisting its legs, we simply resort to a bound 
on the y—component of "o; vector. By applying a lower-bound (Wmin) on it, 
we avoid the left foot to be on the right of the other foot. 

Finally we can write the constraints defined above, together with 7, as 
an optimization problem: 


1 


minimize Ki- + Kyl|Apll? 
timpact A; 
subject to : (6.15) 
tmin < Ar S tmaz; 
Ap < dmar, 
Aol] < Omar, 


r 
Wmin S Oly. 


The decision variables are the impact times. If we select an instant k to 
be the impact time for a foot, then we can obtain the corresponding foot 
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pose, depending on the wheel position, see Eq. (6.12), and the corresponding 
unicycle orientation. The foot will keep this configuration until the beginning 
of the swing phase, where it moves to a new configuration. 

The optimization problem solution is obtained through a simple iterative 
algorithm. The initialization can be done by using the measured position 
of the feet. Starting from this configuration we can integrate the unicycle 
trajectories assuming to know the reference trajectories. Once we discretize 
them, we can iterate over the discrete instant k until we find a set of points 
which satisfy the conditions defined above. Among the remaining points 
we can evaluate J to find our optimum. This point corresponds to a new 
footstep, providing a new initialization for the iterative algorithm. An outer 
loop repeats this algorithm to obtain a series of footsteps. 

The algorithm described here is flexible enough to include some features 
which are useful when commanding a humanoid robot. For example, it is 
possible to avoid steps which would be too small, pausing the motion until 
a larger step can be made. Additionally, we can impose the robot to stop 
always in double support. 

Once footsteps are planned, we can proceed in interpolating the foot 
trajectories and other relevant quantities for the walking motion, e.g. the 
Zero Moment Point [Vukobratovié and Borovac, 2004]. This passage is 
fundamental to obtain trajectories that can be tracked by the following 
control layers. 

In addition, while walking, references may change and it is not desirable 
to wait the end of the planned trajectories for updating them. Thus, the idea 
is to merge two trajectories instead of serializing them. When generating 
a new trajectory, the robot is supposed to be standing on two feet and the 
first switch time needs to last half of its nominal duration. In view of these 
considerations, the middle instant of the double support phase is a suitable 
point to merge two trajectories. This choice eases the merging process since 
the feet are not supposed to move in that instant. Hence, the trajectories’ 
initialization is trivial. Notice that there may be different merge points 
along the trajectories, depending on the number of (full) double support 
phases. Two trivial merge points are the very first instant (full replacement 
of trajectories) and the final point (serialization case). From an intuitive 
point of view, this process may be seen as the interlocking of two jigsaw 
puzzles, as in Fig. 6.3. Each step represents a piece and different trajectories 
can be merged by interlocking them in the merge point, corresponding to 
the mid point of a double support phase. 

This method is similar to the Receding Horizon Principle described in 
Sec. 2.4. In fact, we plan trajectories for a horizon T but only a portion of 
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Time 
Fig. 6.3 Schematic representation of the merge point mechanism. A merge 
point represents the moment where it is possible to interconnect a new 
trajectory discarding the remaining part of the old trajectory. In particular, 
this correspond to the mid point of double support phases. 


them is actually used, namely the first generated step. This simple strategy 
allows us to change the unicycle reference trajectory on-line (through the use 
of a joystick for example), directly affecting the robot motion. In addition, 
we could correct the new trajectories with the actual position of the feet. 
This is particularly suitable when the foot placement is not perfect, as in 
torque-controlled walking. 


6.2.2 The model predictive controller 


The receding horizon controller used in our architecture inherits from the 
basic formulation described in Sec. 6.1.2. Nevertheless, differently from 
[Kajita et al., 2003, Wieber, 2006, Diedam et al., 2008], we have added time- 
varying constraints to bound the ZMP inside the convex hull, both during 
single and double support phases. In this way we increase the robustness of 
the controller. These constraints are modeled as linear inequalities acting on 
the state variables, i.e. 

Z(t)x < z(t). (6.16) 
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The constraint matrix Z and bound z depend on time. In fact, we can 
exploit the knowledge on the desired foot positions in time to constrain the 
ZMP throughout the whole horizon, while retaining linearity. In other words, 
we consider the convex hull as depending only on time, hence changing 
the constraints accordingly. This approach is similar to what presented in 
Chapter 5. In fact, we exploit the controller prediction capabilities to take 
into account changes in the support polygon before they are happening. 

To summarize, the full optimal control problem can be represented by 
the following minimization problem: 


T 
minimize | \|z Ze — Cumpx(|o +||u||% dr 
subject to: (6.17) 
X = Azgme X + Bzmp u, 
Z(t)x < z(t), vt € [0, T), 
x(0) = xo- 


The problem in Eq. (6.17) is solved at each control sampling time by 
means of a Direct Multiple Shooting method (as presented in Section 2.3.2), 
thus differently from [Dimitrov et al., 2008]. This choice has been driven by 
the necessity of formulating state constraints, i.e. Eq. (6.16), throughout the 
whole horizon. We apply the Receding Horizon Principle described in Sec. 
2.4, thus only the first output is used. Since the control input is the CoM 
jerk, we integrate it in order to obtain a desired CoM velocity and position. 
The latter is sent to the inverse kinematics (IK) library [Dynamic Interaction 
Control Lab - Istituto Italiano di Tecnologia, 2016, InverseKinematics sub- 
library], together with the desired foot positions to retrieve the desired joint 
coordinates. Both the MPC an the IK modules use Ipopt [Wächter and 
Biegler, 2006] to solve the corresponding optimization problems. 


6.2.3 The stack of tasks balancing controller 


The balancing controller stabilizes the center of mass position, the left and 
right foot positions and orientations by defining joint torques. The velocities 
associated to these tasks are stacked together into Y: 


E 
Y = [öm e ANE a (6.18) 


Vr, "VIr € Rê are the linear and angular foot velocities. 
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Letting Joom, Jı and J, denote respectively the Jacobians of the center 
of mass position, left and right foot configurations, Jy can be defined as a 
stack of the Jacobians associated to each task: 


z 
Jr= [Jd N I (6.19) 


Furthermore, the task velocities Y can be computed from the robot state 
v using Y = Jyv. By deriving this expression, the task acceleration is 


Y = Jyv + Jyt (6.20) 


In view of (3.37) and (6.20), the task accelerations Y can be formulated 
as a function of joint torques 7, and contact wrenches f, supposed to be 
control inputs: 


Ts +Jdf-Cv-G]. (6.21) 


n 


Y(T:, f) = Irv + Jr M7! ee 


In our formulation we want Y(r,,f) to track a specified task acceleration 


Y*, namely: 
te A= Y=: (6.22) 


The reference accelerations Y* are computed using a simple PD control 
strategy for what concerns the linear terms. In parallel, the rotational terms 
are obtained by adopting a PD controller on the SO(3) group, thus avoiding 
to use a parametrization like Euler angles. The implementation follows what 
is presented in [Olfati-Saber, 2001, Section 5.11.6, p.173]. 

While the task defined in Eq. (6.22) can be considered as high priority 
tasks, we may conceive other possible objectives to shape the robot motion. 
One example regards the orientation of a particular frame. For instance, we 
would like to keep the torso in an upright posture. Similarly to what have 
been done for the other tasks, we exploit the frame Jacobian Jame. Since 
this task is considered as low priority, we minimize the squared Euclidean 
distance of the frame acceleration from a desired value T € R3, ie 


eee |e ee ||? 
minimize 5||Firamer + JtameY — T 
Ts, 


(6.23) 


The reference acceleration J is obtained through a PD controller on S O(3). 
Finally, we also added another lower priority postural task to track joint 
configurations. Similarly as before, the joint accelerations 8(7, fk), which 
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depend upon the control inputs through the dynamics equations detailed in 
Eq. (3.37), are kept close to a reference &*: 


mune 58s, f)-—8 I’. (6.24) 


The postural desired accelerations are defined through a PD+gravity com- 
pensation control law, as in [Nava et al., 2016al. 

Considering the contact wrenches f as control inputs, we need to ensure 
their actual feasibility. To this end we apply the same constraints as in Sec. 
5.3.1. We can group Eq. (6.22) and (5.9) in the following QP formulation: 


minimize T (6.25a) 
Ts,f 
subject to : Y (rs, f) = Y* (6.25b) 
Ast f S stb, (6.25c) 
where the cost function 7 is defined as: 
1). sy w . ; wx |[2 wr 
R2 9 || §(7s, f) Ss I’ + > | Firamer + JtrameY — T F 5 IT|”, 


with wy, w, € R defining the relative priority of each task. With respect to 
Eq.s (6.23) and (6.24), an additional term is inserted, namely the 2-norm of 
the joint torques, which serves as a regularization. Among several feasible 
solutions, we are mostly interested in the one which requires the least effort. 

Let us focus on the foot contact wrenches ;f and „f. During the switch 
phases we expect one of the two wrenches to vanish in order to smoothly 
deactivate the corresponding contact. In order to ease this process we added 
a cost term equal to Sf (F;|\rfll + Fillrfil) where F, (respectively F;) is the 
normalized load that the right (respectively left) foot is supposed to carry. 
It is 1 when the corresponding foot should hold the full weight of the robot, 
0 when unloaded. This information is provided by the planner described in 
Sec 6.2.1. Hence, a high F, would penalize the use of ;f, and vice-versa. 

The QP problem of Eq. (6.25) is solved at a rate of 100Hz through 
qpOASES [Ferreau et al., 2014]. Even if this controller could run at a higher 
frequency, at the current stage 100Hz is the maximum rate at which we can 
set a torque reference to the robot. 

To summarize, this QP controller adopts a mix of tasks with either 
weighted or strict priorities. Indeed, it is very similar to those presented in 
Sec. 4.1.3. During experiments, we tested different solutions, but this proved 
to adapt well with the preceding control layers and the low-level controller 
of the iCub humanoid robot. 
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Fig. 6.4 A first skeleton of the architecture, composed only by the MPC (RH) 
controller connected to the inverse kinematics (IK). Their output is directly 
sent to the robot. 


6.3 Validation and experimental results 


The presented architecture is composed by three different parts. In this 
section, we show three experiments whose aim is to test each part in an 
incremental way. All the experiments are performed on the iCub robot, 
controlling 23 Dofs either in position or in torque control. 

The snapshots presented in Fig.s 6.6 and 6.8 are obtained from the 
video included in the playlist: https://www.youtube.com/playlist?list= 
PLBOchT-u69w9hJ6BmqPf06r0zWmungOrc. 


6.3.1 Test of the predictive controller in position control 


In the first experiment, we use the MPC module to control the robot joints 
in position control. As depicted in Figure 6.4, the control loop is closed on 
the CoM position only. In fact, we noticed that the noise introduced by joint 
velocity measurements can strongly affect the performance of the overall 
architecture. The reference foot poses x7, and the desired ZMP profile 
£Thyp are obtained through an off-line planner [Hu et al., 2016, Chapter 9]. 

Both the MPC and the IK are running on the iCub head, which is 
equipped with a 4t? generation Intel® Core i7@1.7GHz and 8GB of RAM. 
The whole architecture takes in average less than 8ms for a control loop. 

The robot is taking steps of 14cm long in 1.35s (of which 0.8s in double 
support). Fig. 6.6 presents some snapshots of the robot while walking. 

Fig. 6.5 presents the ZMP tracking performances of the MPC controller. 
The desired ZMP position is obtained as a set of straight lines connecting the 
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(a) Tracking of the desired ZMP on the zy plane. 
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(b) Tracking of the desired ZMP along the y direction in time. 


Fig. 6.5 Tracking of the desired ZMP trajectory by the MPC controller. It 
is possible to notice the overshoots in the y direction. Constraints in ZMP 
avoid this overshoot to make the ZMP exiting the support polygon. 
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(a) t= to (b) t= to + 1s (c) t = to + 2s (d) t = to +38 


Fig. 6.6 Snapshots of the robot while walking straight in position control. 


feet’s centers. It is possible to notice that the ZMP obtained from the MPC 
is smoother than the desired one. In addition, it presents some overshoots. 
The constraints applied on the ZMP position make sure that these overshoots 
are within the support polygon. 


6.3.2 Adding the unicycle planner 


Fig. 6.7 shows that the controller has been improved by connecting the 
unicycle planner described in Sec. 6.2.1. This allows us to adapt the robot 
walking direction in an online fashion. As an example, by using a joystick 
we can drive a reference position x7 for the point F attached to the unicycle. 
Depending on how much we press the joypad, this point moves further away 
from the robot, generating a faster walking motion. 

Using the planner described in Sec. 6.2.1, the step timings and dimensions 
are not fixed a priori. In this particular experiment, a single step could last 
between 1.35 and 5.0s, while the swing foot can land in a radius of 0.175m 
from the stance foot. Fig. 6.8 presents some snapshots of the robot while 
taking a right turn. 


6.3.3 Complete architecture 


Finally, we also plugged the task based balancing controller presented in Sec. 
6.2.3. The overall architecture is depicted in Figure 6.9, highlighting that the 
stack of task balancing controller is now in charge of sending commands the 
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Fig. 6.7 The scheme of Figure 6.4 has been upgraded with the unicycle 
planner which is responsible of providing online references. 


(a) t = to (b) t= to + 1s (c) t = to + 2s (d) t = to + 3s 


Fig. 6.8 With the unicycle planner, the robot advances while turning right. 


98 


Unicycle 
Planner 


Balancing 
Controller 


x 
TZMP 
CoM CoM Inverse 
— >| ; A 
MPC SSS Kinematics 
yk 
ZCoM 
ai 
e TCoM 
CoM 
Loom LCoM 
Forward 
Kinematics 


Ts 


Fig. 6.9 The complete architecture. The output of the IK is no longer sent 
to the robot, but to the stack-of-task balancing controller, together with the 
desired CoM and foot positions. Joint positions and velocities are taken as 
feedback from the robot. 
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Fig. 6.10 Center of Mass (CoM) tracking when taking some steps in torque 
control. The quantities are expressed in an inertial frame Z. 
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(b) Right foot 


Fig. 6.11 Tracking of the foot positions. After a step, the desired values 
are updated according to the measured position. This allows to cope with 
imprecise foot placements. This mechanism is the cause of the jumps on the 
desired values visible after the end of the step. 
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robot. In order to draw comparisons with the first experiments, we followed 
again a straight trajectory. Even if the task is similar, it is necessary to use 
the unicycle planner now in order to cope with foot positioning errors (see 
Fig. 6.11). In fact, trajectories can be updated in order to take into account 
possible foot misplacements. Fig. 6.10 shows the CoM tracking capabilities 
of the presented balancing controller. During this experiment, the minimum 
stepping time needed to be increased to 3s (the maximum is still 5s), while 
maintaining the same maximum step length of the previous experiment. 


6.3.4 Comparison and discussion 


Torque control experiments have shown some bottlenecks that prevented 
achieving higher performances. These issues are mainly related to the 
estimation of joints torques (limiting the performances of the joint torque 
controller) and to the floating base estimation. Regarding the first problem, 
the iCub humanoid robot does not have joint torque sensors, but it exploits 6 
F/T sensors to estimate these values [Fumagalli et al., 2012]. The estimations 
is provided at 100Hz, limiting the performances of the low-level torque 
controller. In addition, the F/T sensors are subject to noise and biases. This 
leads to a peak error in the torque tracking as high as 15Nm (on the hip roll 
joint) during the experiment showed in this chapter. To give a comparison, 
the maximum desired torque required on the same joint is 25Nm. 

Regarding the floating base estimation, we use a simple legged odom- 
etry algorithm where we keep track of the base pose only trough joint 
measurements. This algorithm causes unwanted discontinuities when the 
standing foot slightly rotates. This limitation affected the maximum velocity 
achievable by the robot. 

In view of these limitations, walking in position control strongly outper- 
forms the torque mode. The precise joint position tracking allows a more 
repeatable and faster walking. On the other hand, the robot is not able to 
adapt to any unevenness of the terrain, nor pushes. 


6.4 Conclusions 


This architecture is able to cope with variable walking speed, generating 
trajectories which don’t require the robot to step in place while coping with 
online changes of desired reference trajectories. The presented architecture 
is also flexible enough to allow controlling the robot by defining either joints 
positions or torques. 
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Compared to Chapter 5, this architecture uses more simplified models at 
the planning level. Nevertheless, in both cases the inner control loop uses the 
full robot model to compute the desired joint torques. The main difference 
between the controller presented in Sec. 6.2.3 and the one of Sec. 5.1 is that 
the latter tracks the foot reference poses indirectly, through the definition of 
desired joints position in a low-priority task. Instead, the one presented in 
Sec. 6.2.3 tracks a Cartesian reference for the feet as a high-priority task. 
Hence, the controller presented in this chapter is more precise and suitable 
to track a walking trajectory. 

The architecture presented in this chapter does not allow the robot to 
execute motions as wide as those shown in Sec. 5.6. In addition, it does not 
allow the adaptation of desired footsteps in case of external perturbation. 

In the next chapter, we still use a simplified model for the robot mo- 
mentum, but more detailed. We remove the assumption of constant height, 
planning dynamic step-ups for the IHMC Atlas. 
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Optimal Control of Large Step-Ups 
for the Atlas Robot 


In this chapter, we present a non-linear trajectory optimization method for 
generating step-up motions. We adopt a simplified model of the centroidal 
dynamics to generate feasible Center of Mass trajectories. We describe this 
model in Sec. 7.1. The activation and deactivation of contacts at both feet 
are considered explicitly, assuming a fixed sequence. A particular choice of 
cost function allows reducing the torques required for the step-up motion. 
Sec. 7.2 presents the trajectory generation problem and its transcription 
into an optimization problem. The output of the planner is a Center of Mass 
trajectory plus an optimal duration for each walking phase. These desired 
values are stabilized by a QP-based whole-body controller, introduced in Sec. 
7.1, which determines a set of desired joint torques. Experiments are carried 
on the Atlas humanoid robot, showing a reduction of the maximum leading 
knee torque of about 20%. They are presented in Sec. 7.3, while Sec. 7.4 
contains concluding remarks. 


7.1 Background 


7.1.1 The Atlas QP-based whole-body controller 


The IHMC Atlas humanoid robot is controlled by means of a QP-based 
whole-body controller [Koolen et al., 2016a], which is able to maintain the 
robot’s balance also in case the available contact patch reduces drastically 
[Wiedebach et al., 2016]. This controller is first introduced in Sec. 4.1.3 and 
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Fig. 7.1 Atlas performing a 31cm tall step-up. 


here presented with a much higher level of detail. This is to describe how 
the presented planner interfaces with the robot. 

The Atlas QP controller determines a set of desired joint accelerations, 
including the spatial acceleration of the floating-base joint with respect to 
T. We denote this quantity with te € R”+6 where n is the number of joints 
under control. In addition, the whole-body controller generates desired 
contact wrenches. The controller handles different motion tasks, which 
include Cartesian and center of mass (CoM) tasks. The former are achieved 
by minimizing the following quantity: 


where J, € R™*("+8) is the task Jacobian, with n being the number of 
degrees of freedom of task t. a* € R™ is the task desired acceleration, while 


2 
(7.1) 


Jie — (a* — jw) 


v are the measured base and joint velocities. The quantity (a* — iw) isa 


vector which does not depend on the decision variables ùe, and it is referred 
with the symbol 6r. 
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The CoM task is achieved by controlling the robot momentum gh € R6. 
In particular, we exploit Eq.s (3.47)-(3.48): 


Ne 
ah = Jommv + Jommy = mg + ye axel, (7.2) 

k=1 
where each contact wrench is supposed to be applied on a 2D patch. A 
different point force is considered applied at each vertex v of a contact patch 
k. In particular, these forces are defined through a coordinate vector o; € R”, 


whose basis are m extreme rays of the friction cone. By constraining each 
component of o; in the range Cag ge] , we avoid excessively high wrenches. 


More importantly, friction and Center of Pressure [Sardain and Bessonnet, 
2004] (CoP) constraints are implicitly satisfied [Pollard and Reitsma, 2001]. 
Trivially, also the resulting normal force is bounded to be positive. The 
corresponding wrench in the frame G, is expressed as a function of x, 0: 


oX* f= X. Qr, k0, (7.3) 


where Qg, € R®*"e depends on the position of the contact vertex v and on 
the basis vectors. By concatenating all the Qk, and x,,@, we obtain 


So @X"f= Qe. (7.4) 
k=1 


The Cartesian and CoM task are grouped as follows: 


J 
JcmmM 


SO OS 
JCMMY — gh* i 
with J and 8 being the concatenation of all the tasks Jacobians J; and 6; 
respectively. Gh* is a desired momentum rate of change. 
At every control cycle, the solved QP problem writes: 


minimize T= ||J% — Êl, + lele, + lleelle, 


Ve, Q 
subject to: Joma’: + Jommy = mg + Qe 
poe < o < pu. 


The desired joint accelerations and contact wrenches are used to compute a 
set of desired joint torques to be commanded to the robot. 
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Compared to the whole-body controller presented in Sec. 6.2.3, the one 
described here considers all the tasks at the same priority, relying to the 
weights in I to determine the relative importance of each task. This approach 
has the advantage of avoiding tasks infeasibility. In other words, the resulting 
optimization problem cannot result unfeasible because of not attainable refer- 
ences. On the other hand, tuning is more complicated. Another difference is 
represented by the use of the contact force parametrization through @, which 
avoids the insertion of inequalities like Eq. (6.25c). Finally, this controller 
outputs desired acceleration, while the one presented in Sec. 6.2.3 defines 
reference joint torques. 


7.1.2 The variable height double pendulum 


The linear momentum rate of change can be written according to Newton’s 
second law, similarly to what is done in Sec. 5.2: 


MECoM = `> kf + mg, (7.5) 
k 


where g € R? corresponds to the first three rows of g. pf € R? is an external 
force expressed in Z. We can express the wrenches applied at the feet in a 
frame parallel to Z, having the origin coincident with the corresponding CoP. 
We assume that no moment is exerted along the axis perpendicular to the 
foot, passing through the CoP. Thanks to these two choices, the moments 
are null. It is possible to enforce the angular momentum to be constant by 
constraining these forces along a line passing through the CoM, i.e.: 


of = màs (xcom = Zacor ) (7.6) 


The symbol o is used as a placeholder for either the left l or the right r 
foot. of € R? is the force applied on the foot, measured in Z coordinates. 
ào € R is a multiplier. Since the contacts are considered unilateral, Ao is 
constrained to be greater than zero for guaranteeing the positivity of normal 
forces. Facop € R is the position of the foot CoP expressed in Z. It is a 
function of the foot position op, its orientation, and of the CoP expressed in 
foot coordinates, o£CoP: 


t CoP = op +# Roo£CoP. (7.7) 


Notice that the z coordinate of o£cop is zero by construction. For the sake of 
simplicity, from now on we will drop the Z superscript in front of the rotation 
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matrix 7R,. For the same reason, we drop the subscript CoM from the 
quantity expressing the CoM position. This results in the following variation 
of notation: 


R, =? R, (7.8a) 
T := 06M (7.8b) 


Assuming the forces exerted on the feet to be the only external forces, 
we can rewrite Eq.(7.5) as: 


më = mg +m, (x — ıp — Ry 1Xcop) (7.9) 
+ mp (a —,p—R, r£CoP) - . 


Dividing by the total mass of the robot, we can finally obtain the equation 
of motion for the variable-height double pendulum: 


LH=g+A(x- ıp — Ri 1Lco 
g 1 ( Ip 1 1£CoP ) (7.10) 
+ Ap (£x -—,p— R, r&CoP) - 


This model have been studied and adopted both in the single contact [Koolen 
et al., 2016b, Posa et al., 2017, Caron and Kheddar, 2017, Caron and Mallein, 
2018] and multi-contact scenarios [Perrin et al., 2018], as in this chapter. 


7.2 Dynamic planning for large step-ups 


The step-up planner presented in this chapter leverages the idea of phase-based 
trajectory optimization [Winkler et al., 2018, Carpentier et al., 2016, Caron 
and Pham, 2017]. We assume to know the contact sequence beforehand, 
simplifying the handling of their activation and deactivation. In addition, we 
also assume the foot positions to be known in advance. Given the application, 
this represents a safety measure too. 

This section presents constraints, tasks and methodologies used to solve 
the corresponding non-linear trajectory optimization problem. 

We start by simplifying Eq. (7.10) considering a simple forced double 
integrator dynamics, as follows: 


a € R? assumes different values depending on the contact state of the 


corresponding phase 7: 
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e no contacts 


ai = g; 
e one foot in contact (o is either l or r) 
ai =g + ào (£ — op — Ro o€cop); 
e both feet in contact 
ai = g + à (x£ — ıp — Ri 1fcop) + Ar (£ — rp — Ry rXcop). 


We can obtain an approximated discrete form of Eq. (7.11) through a second 
order Taylor expansion: 


x(k +1) = x(k) + dt; v(k) + sat? ai(k), (7.12a) 
v(k + 1) = v(k) + dt; aj(k), (7.12b) 


where k € R is used to indicate a generic discrete instant, while v(k) € R3 is 
the CoM velocity at instant k. The duration T; € R of each phase 7 is not 
defined a-priori, but it is considered as an optimization variable. Nevertheless, 
each phase consists of a fixed number of instants. Hence, the integration 
step can be easily computed, i.e. dt; = T;/N, where N € Nt is the number 


of instants per phase. In addition, each T; is bounded in 7; min gmax, 


7.2.1 Force and leg constraints 


The forces applied at the feet need to satisfy some conditions in order to 
be attainable by the robot. These are embedded as constraints in the 
optimization problem. 

First of all, the CoP has to lie inside the foot polygon. This can be 
obtained by specifying a set of linear inequalities: 


A» o£ CoP < bo, (7.13) 


where the matrix A, € R®*? and the vector be € R” can be computed 
according to the foot location of and shape. In particular, we assume it has 
a polygonal shape with v vertices. Since o£cop is defined in foot coordinates, 
A, and b, do not depend on the foot pose. 

The contact forces are supposed to lie within the friction cone in order 
to avoid slipping. This can be achieved by imposing 


3J +SS? = hs ofe (7.14) 
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ls € R is the static friction coefficient and 2f € R? is the force exerted on 
a foot, expressed in foot coordinates, namely °f = R!.f. We assume the 
foot frame to be parallel to a frame attached to the ground, with the z— 
axis perpendicular to the walking surface. Using Eq. (7.6) and (7.7), we can 
rewrite Eq. (7.14) as follows: 


f 1 =p] (Rom. (x — op — Ro -@cop)) <0. (7.15) 


Notice that when A, = 0, friction is automatically satisfied as every compo- 
nent goes to zero. Thus, we can simplify Eq. (7.15): 


E 1 =p] (RI (x — op h cop) <0. (7.16) 


In order to consider the torsional friction, we impose the equivalent 
normal torque oTz, applied at the origin of the foot frame, to be bounded, i.e. 


— bt ofz < oTz < hto fz, (7.17) 


with u € R the torsional friction coefficient. Given that the external force 
of is applied in o£cop, we can rewrite this constraint as follows: 


T T T 
0 oXLCoP,y 0 
— |0 of < oLCoP,x of < | 0 of (7.18) 
Met 0 ht 
or equivalently 
-dief Ce < dief. (7.19) 


Notice that cl Sf corresponds to the z component of the cross product 
op x Sf. By substituting $f as for the static friction constraint, we can 
finally write the following set of constraints: 


F,R] (x = —p— Ro o€CoP) <0, (7.20) 


where 


n 
F, = ie = | . (7.21) 


—Co — ds 


When planning, we have to take into consideration the robot kinematic 
limits. We can approximate the robot leg length with the distance between 
the CoM and the corresponding foot position, thus limiting excessively wide 
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motions or movements which could cause a leg to collapse. This can be 
achieved with the following constraint: 


— om < || — op|| < ™™, (7.22) 


with (™™ max € R the lower and upper bound, respectively. These con- 
straints can be applied on each leg separately. 


7.2.2 Tasks 


The cost function is designed to generate a CoM trajectory for the robot to 
perform large step-ups, exploiting its momentum. It is composed of several 
tasks. The full cost function is shown at the top of Optimization Problem 1. 

The first task weights the distance of the terminal states from the desired 
position a* and velocity v*: 


Pas = we D> (lee) — 2*|? + lok) — v* |?) (7.23) 


k=Ky 


where ws» € Rt is a tunable gain. Ky contains a certain portion of the 
last phase, including the terminal state. Thus, it is possible to generate 
trajectories which reach the end point in advance while defining control 
inputs able to maintain such position. 

The model defined by Eq. (7.10) does not carry any information about 
the joint torques necessary to achieve the planned motion. Nevertheless, 
with the aim of reducing the torques required at the leading knee to step-up, 
we introduce the following task: 


T; = w (i(k)? + rt(k)?) + Wrmax (opra? + max a) ; 
where 


(7.24) 


ta — oPz — 06*) Ao(k), ifo in contact, 
ot(k) = 

0, otherwise, 

is used as a heuristic to reduce joint torques. .6* is a reference height 
difference. Intuitively, the robot knees undergo the highest stress when they 
are highly bent. In this configuration, the height difference between the CoM 
and the foot gets small. This heuristic prevents the solver from requiring 
high forces (characterized by a high multiplier A.) on a collapsed leg. Thus 
T- weights the summation over each time instant of the this quantity squared. 
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The second term is employed to prevent this quantity to have an impulsive 
behavior. w, € Rt and wz,,,, E€ Rt are weights allowing to specify the 
relative priority for each task. 

Considering the control inputs \.(k) and .&cop(k), it is preferable to 
generate trajectories which require small control variations. This reduces the 
need for large torque variations when tracking the trajectories on the real 
robot. To this end, we consider the following task: 


Ta,=wa, >, llou(k) —.u(k—1)I), (7.25) 
0,k=2...N-P 


where wa, € R* is the task weight and ou = [às othar] 


Finally, we consider a series of regularization terms: 


Lreg = wi) |E = Trall? + wu > louk). (7.26) 
i k,o 


This task allows selecting solutions which minimize the use of control inputs 
and the duration of each phase is close to a desired one T; d. Wt, Wu E€ R are 
the respective tasks weights. 


7.2.3 The optimization problem 


The trajectory optimization problem constituted by the dynamic constraint 
defined in Eq.(7.12), the constraints listed in Sec. 7.2.1 and the tasks of 
Sec. 7.2.2, is casted into an optimization problem via the Direct Multiple 
Shooting method, as described in Sec. 2.3.2. The optimization variables x 
correspond to the following set: 


x(k), k=0...N-P, 
v(k), k=0...N-P, 

_ jatk), k=1...N-P, 

SS VOY. BET RaSh, 
atesp(h); k=1...N; P, *=1, r, 
Ti T ER os 


with P € N* equal to the number of phases. x(k) and v(k) are state variables 
and need to be initialized with the measurements from the robot æo, vo: 


x(0) = zo, (7.27) 
v(0) = vo. (7.28) 
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(a) t = 3.38 (b) t = 4.28 (c) t = 5.68 (d) t = 8.88 


Fig. 7.2 Snapshots! of the step-up motion. These instances correspond to a 
switch of phases, except for the one at time 4.2s which corresponds to the 
lowest CoM height. 


Its complete formulation is shown in Optimization Problem 1. Note that 
index 7 refers to a specific phase. Hence, quantities that depend on it are 
fixed for its entire duration and considered as datum. It is implemented 
using the CasADi [Andersson et al., 2018] toolkit, where Ipopt [Wachter and 
Biegler, 2006] is selected as solver. 

The optimization problem is non-convex due to the non-linearities of the 
model equation, of the friction constraints and the chosen tasks. In order to 
facilitate the finding of a solution, we initialize the phases duration T; with 
their desired value. The CoM position trajectory is initialized to a linear 
interpolation from ap to a*. All other variables are initialized to zero. 

We noticed that it is particularly important to initialize at least the CoM 
height with a value different from zero. In fact, in case the CoM and the feet 
belongs to the same plane, the forces would vanish, see Eq. (7.6), and gravity 
would be impossible to compensate. Similarly, T; should not be initialized to 
zero to avoid numerical problems in Eq. (7.12). 


7.3 Validation and experimental results 


The trajectories generated with the method described in Sec. 7.2 are stabilized 
by the controller presented in Sec. 7.1.1. The desired accelerations and 


‘https: //www.youtube.com/playlist?list=PLBOchT-u69w9hJ6BmgPf06r0zWmungOrc 
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Optimization Problem 1: 


minimize [,« +r; +Pa, + Treg 
x 


subject to: 


end 


x(0) = x 


ge pL eT aR 

dt; = TN 

for k = ((i—1)P 
( 


ee — 1) do 

x(k +1) =a(k)+ i n + 5dt? a(k) 

v(k + 1) = v(k) + dt; a(k) 
if F in contact then 
a := a + Aj(k) (x(k) — 

Aj(i 2 < b(i) 
2 

[1 1 - u] RÌ (i) (x(k) )-Rı(i Jiæcop(k))) <0 

F,(i)R} (i) (x(k) — ıp(i) — a <0 
e 
end 
if right in contact then 
a i= a + A (k) («(k) — pli) — R, (i) £cop(k)) 
A,(i) r&Cop(k) < b, (i) 

2 

[11 = 42] (RTO (æ(k)=rpli)-R, i æcop(k))) <0 
F;()RT (i) (æ(k) — -p(8) - R, (i) -2oop(k)) < 0 
1™™ < ||æ(k) — rpl] < em 
end 
a(k)=a 


ipli) — Ri(i) @cop(k)) 


end 
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Fig. 7.3 Tracking of the CoM position along the planar directions. 
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Fig. 7.4 Tracking of the CoM height. The vertical dotted lines correspond to 
a change of phase. 
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Fig. 7.5 Tracking of the desired height velocity. The vertical dotted lines 
indicate the change of phases. It is possible to notice that the tracking 
worsens around t = 5s, which is close to the end of the double support phase. 
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Baseline 


— Step-up planner 


Knee torque (Nm) 


Time (s) 


Fig. 7.6 Comparison of measured knee torques. The baseline is obtained by 
letting the controller described in [Seyde et al., 2018] generating the motion 
automatically. The duration of each phase is generated by planner in both 
cases for easing the comparison. The dotted lines are the peak torques. The 
maximum torque required to the robot when using the step-up planner is 
20% lower than the baseline. 
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contact wrenches are turned into a set of desired joint torques commanded 
to the Atlas humanoid robot. 

The task consists in stepping onto a platform whose height is about 31cm, 
as showed in Figure 7.1. We noticed that this height is already enough to 
have the robot pushing the leg joint limits during the swing phase. The 
robot starts at a distance of about 30cm from the platform and performs a 
55cm step forward on the platform. Once the robot is close to the platform, 
the optimization is started using the current robot state as initialization. 
The optimization problem is solved using a desktop computer equipped with 
a 5’ generation Intel® Core i7@3.3GHz processor and 64GB RAM memory, 
running Ubuntu 16.04. Ipopt is set up to use the mumps [Amestoy et al., 
2000] solver. A solution is generated after about 1.5s. This time can be 
reduced up to a factor 3 by using the linear solver ma27 [HSL, 2007] instead 
of mumps. Nevertheless, considering that the trajectories are generated only 
once before the beginning of the motion, this improvement is not necessary 
at this stage. 

The length of each phase is set to N = 30 instants. We noticed that this 
value is a good trade-off between the computational time and the smoothness 
of the generated trajectories. The maximum duration of a phase is equal to 
2.38, thus corresponding to a maximum dt = 77ms. 

Amongst the set of variables x constituting the solution provided by 
the solver, we pack the CoM state into a desired trajectory. In addition, 
the timings T; are used to determine the single and double support phase 
durations. Fig. 7.3a and 7.3b show the CoM position tracking along the x 
and y direction. Figure 7.4 presents the tracking of the CoM height. Note 
that the CoM is lowered right before performing the step-up. This instant is 
shown in the snapshots of Fig. 7.2. We believe such desired motion arises 
to achieve sufficiently high vertical velocity, see Fig. 7.5, while considering 
the limits on the leg length. By performing this motion, the robot gains 
momentum, requiring a lower torque on the leading knee. 

In order to provide a comparison, we performed the same task without 
prescribing any CoM trajectory, thus adopting the motion generation tech- 
niques presented in [Seyde et al., 2018]. To facilitate the comparison, we 
impose the same phase durations. Fig. 7.6 shows the torque profile measured 
on the left knee when performing the step-up. We focus on this joint since is 
the one witnessing the highest torque expenditure during the step-up motion. 
In particular, using the controller presented in [Koolen et al., 2016b], the 
maximum value of this torque reaches 591Nm, while it is reduced to 478Nm 
when using the method presented in this chapter. This corresponds to a 
nearly 20% maximal torque reduction. Such result comes at the cost of a 
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— Step-up planner Baseline 


200 T 


Knee torque (Nm) 


Fig. 7.7 Leading knee torques when the static friction is reduced to 0.5. Here 
the reduction of the maximum value is about 10%. 


higher torque on the trailing knee. In order to facilitate the step-up motion, 
the right leg is much more exploited compared to the baseline, with a knee 
torque of nearly 430Nm. 

Since the trajectories are computed offline, small disturbances can induce 
the robot to a fall. Robustness is achieved by limiting the CoP to a smaller 
region compared to the real robot foot in the planning phase. In this way, 
there is margin in the QP controller for counteracting disturbances while 
executing those trajectories. During the experiment described above, the 
foot dimension is set to 30% of the original size. 

The planned step-up is a dynamic motion which requires a high vertical 
velocity of the CoM. We noticed that the static friction coefficient us has 
a particular effect in determining how “dynamic” is the planned motion. 
Intuitively, it limits the angle between the normal to the ground and the 
pendulum. A small angle corresponds to a conservative motion where the 
projection of the CoM on the ground lies well inside the support polygon. In 
the experiment shown above, the static friction coefficient is set to 0.7. By 
reducing it to 0.5 the motion is more conservative and robust, even though 
the effectiveness of the method is reduced to a maximum torque reduction 
of 10%. As shown in Fig. 7.7, the knee torque reaches a peak of 532Nm. 

When performing experiments on the real robot, the poor tracking of the 
desired trajectories limits the robustness and reproducibility of the results. 
Since the trajectories are computed offline, a poor tracking limits their 
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Fig. 7.7 Tracking of the CoM position during the simulation experiment. 
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Fig. 7.8 Comparison of measured knee torques during the simulation experi- 
ment. 
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efficacy. In the majority of the experiments, the reduction of the maximum 
torque consisted of about 10%. Anyhow, we believe this still represents an 
interesting achievement. It shows that trajectory optimization techniques 
have an effect in reducing the maximum effort required to a 150kg robot 
when performing a motion at the limits of its reachable workspace. 

When performing the same test in simulation, the tracking on the planar 
directions is much better, as shown in Fig. 7.8a and 7.8b, while along the z 
direction, it is somehow comparable, Fig. 7.7c. The improved tracking helps 
in maintaining balance, allowing more dynamic motions. The figures shown 
are generated with a static friction coefficient equal to 1.0, while the timings 
are kept equal to the real robot experiments. The baseline is obtained in the 
same way, leaving to the controller the generation of desired trajectories given 
a defined sequence of footsteps. As it can be noticed from Fig. 7.8, when 
the robot switches from double to single support the knee torque has a spike 
of 702Nm. Given the coincidence with the change of support configuration, 
the source of such spike is most probably the controller itself. Since, in 
simulation, the torque control is almost perfect, this spike is reflected also in 
the measured data. Conversely, on the real robot, any spike on the desired 
torques is smoothed due to the actuator dynamics. Nevertheless, when 
tracking the trajectories provided by the planner, there is no spike and the 
maximum torque is at 489Nm. 

The weights adopted in the cost function are the same in simulation and 
on the real robot experiments. The tuning process consists in adding one 
task at a time starting from the most important, i.e. T4, which is assigned 
an arbitrary weight. 


7.4 Conclusions 


This chapter presents a method to generate trajectories for large step-up 
motions with humanoid robots. Its effectiveness has been tested on the IHMC 
Atlas humanoid robot. Experiments carried on the real platform showed that 
such trajectories can reduce the maximum torque required to the knee up to 
20%. The method has been studied for large step-ups, but its applicability is 
not limited to this domain. Indeed, its formulation enables the planning of 
motions involving flight phases, hence requiring dynamic movements. This 
represents a fascinating future work. The desired trajectories are computed 
offline right before starting the step-up motion. This reduces the robustness 
of the planned trajectories, strongly relying on the performances of the 
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low-level controller. Poor tracking of the planned trajectories limits also 
their efficacy in reducing the maximum knee torque. 

Both Chapter 5 and 6 use predictive techniques which are successful only 
if the output is applied on the robot at every control iteration. Instead, in 
this chapter we use Multiple Shooting in a trajectory optimization framework. 
Since the trajectories are computed off-line, before performing the motion, the 
computational time is not a major obstacle. Indeed, this planner is the slowest 
compared to those presented in the previous two chapters. Nevertheless, 
the time horizon is also consistently larger. In addition, the optimization 
problem is non-linear, both in the costs and in the constraints. Thus, it 
cannot be reformulated as a QP problem, which is usually easier to solve. 

It is worth comparing the approach to angular momentum. The linear 
approximation adopted in Sec. 5.2.1 here is not applicable because of the 
much larger control horizon. As a consequence, we constrain its derivative 
to be null, similarly to the LIP model presented in Sec. 6.1.2. 

In the next part, we perform a complete modeling of the humanoid robot 
in contact with the environment. In particular, we consider the full robot 
kinematics and its centroidal dynamics, without any assumption neither on 
the CoM height, nor on the angular momentum. The interaction between the 
robot and the walking surface is modeled explicitly through different contact 
parametrizations. The approach does not need a predefined contact sequence. 
Nonetheless, walking patterns emerge automatically by specifying a minimal 
set of references, such as a constant desired Center of Mass velocity and a 
reference point on the ground. This new planner is conceived to be used at 
low frequency, allowing the insertion of another control layer. At the same 
time, it can be iterated to produce off-line trajectories in joint space, which 
are tested on the iCub humanoid robot. 
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Part III 


Predictive Control Based on 
Complete Models 


Your assumptions are your windows on the world. Scrub them off every once 
in a while, or the light won’t come in. 


Isaac Asimov 
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Modeling of a Humanoid with 
Complementarity Conditions 


In Part IT the focus is on the application of predictive controllers to locomotion 
problems. Approximations are introduced at various stages. In this part, we 
aim at adopting a model as detailed as possible. We start describing in this 
chapter the dynamical equations and the algebraic constraints which model 
the system under control, namely a legged robot instantiating contacts with 
the ground. It can be noticed that several inequality conditions are also 
employed. As a consequence, the resulting system is an inequality-constrained 
DAE, as introduced in Sec. 2.1. 

Sec. 8.1 introduce some preliminary concepts we adopt to describe a 
humanoid robot in contact with the environment. The definition of com- 
plementarity constraints, as introduced in Sec. 4.1.1, is fundamental and 
tackled in Sec. 8.2. The following sections are devoted to other aspects 
critical to define a meaningful model for whole-body trajectory generation. 
Finally, Sec. 8.5 presents the complete model. 


8.1 Preliminaries 


Humanoid robots are generally equipped with feet of finite size. The avail- 
ability of a flat surface in contact with the ground increases the capabilities 
of walking controllers to cope with external disturbances. On the other 
hand, when performing a step, the foot can impact the ground in a not flat 
configuration, reducing the amount of contact wrenches obtainable from the 
ground. Conversely, toe-off motions can be used to increase the work-space 
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available during double support phases [Griffin et al., 2018]. Given these 
reasons, all the various contact configurations should be taken into account 
when planning step motions. By considering the foot as a flat surface, it 
would be required to model how it can land on the walking surface, also 
limiting the set of possible wrenches depending on the situation. As an 
example, in case of line contact, no torque can be exerted along that line, as 
in a classical hinge mechanism. 

In order to reduce the complexity, it is possible to consider the foot as 
composed by a set of points, for example (but not limited to) four points 
located at the corners of the foot. As an example, this approach is adopted 
in [Wensing and Orin, 2013, Dai et al., 2014, Caron and Pham, 2017] and in 
Chapter 7. The advantage is that the several contact configurations can be 
modeled independently, depending on the number of points in contact. 

A pure force is supposed to be applied on each of the contact points. 
In case of four points, twelve variables are used to define a six dimensional 
quantity, i.e. the resulting contact wrench acting on the foot. This is a 
drawback that will be addressed later in Sec. 9.3.2. Appendix A is devoted 
to the computation of these forces starting from a generic wrench. 

Define ;p € R? as the i—th contact point location in an inertial frame Z, 
and ;f € R? as the force exerted on that point. Such force is expressed on a 
frame located in ;p and with orientation parallel to Z. We assume to have 
full control over the derivative of both these quantities: 


ip = Up, (8.1a) 

if = U;f; (8.1b) 

where u,p, u,¢ € R? are the control inputs for the i-th contact point. Since 

contact points are not supposed to penetrate the walking ground, we can 
impose h(;p) > 0. 

The force ;f results from the interaction of the contact point with the 


ground, hence it is limited. Being a reaction force, its normal component with 
respect to the walking ground is supposed to be non-negative. In particular, 


n(ip)| if > 0. (8.2) 


Additionally, we focus on the case where the contact force is not enough to 
overcome the static friction: 


ltp) fll < us np) if, (8.3) 


where us is the static friction coefficient. The meaning of functions h(-), n(-) 
and t(-) is available in Sec. 1.3. 
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For what concerns the robot control, we consider having full joint con- 
trollability. In particular, we assume the joint velocities as a control input: 


S = Us. (8.4) 


This assumption may seem unrealistic, but it allows for the insertion of 
an additional control loop. Joint velocities, together with contact vertex 
positions and forces, can be considered as a reference to one of the whole-body 
controllers presented in Sections 6.2.3 or 7.1.1. 

Finally, the base rotation included in q (see Sec. 3.4) is vectorized using 
the quaternion parametrization. The corresponding unitary quaternion is 
called pg € H. The base position is indicated with the symbol Fpp € Rè. 
The equations governing the dynamical evolution of the base are as follows: 


Tog =*Rp? 1,2, (8.5a) 
Tg = Up. (8.5b) 


Bur g € R, Up € R* and us € R” are control inputs, defining the base 


linear velocity, the quaternion derivative and the joints velocity, respectively. 
More specifically, ? vz,B is the linear part of 5 Vz, B € RÊ the left-trivialized 
(i.e. measured in body coordinates, see Sec. 3.2 for details) base velocity. 


8.2 Contact parametrization 


Given its reactive nature, the contact force ;f applied to the i—th contact 
point is supposed to be not-null only if the point is in contact with the walking 
surface. This condition could be represented by the following equality: 


h(yp) alip) if = 0. (8.6) 


Such a constraint can be difficult to tackle in an optimization framework. 
This is due to the fact that the feasible set is only constituted by two 
lines, namely h(;p) = 0 and n(;p)';f = 0, which are intersecting in the 
origin. In particular, at this point, the constraint Jacobian is singular, thus 
violating the linear independence constraint qualification (LICQ), on which 
most off-the-shelf solvers rely upon, Sec. 2.5. 

We present three approaches aimed at mitigating this problem: 


e Relaxed complementarity; 
e Dynamically enforced complementarity; 


e Hyperbolic secant in control bounds. 
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8.2.1 Relaxed complementarity 


The condition defined in Eq. (8.6) can be relaxed by noticing that both h(;p) 
and n(;p)';f are positive quantities. Hence, instead of using an equality 
condition, we can upper-bound their product with a small positive constant 
€ ER: 

hip) nip) if < €. (8.7) 


Through this simple modification, the feasibility region increases in dimen- 
sion. In addition, if the inequality constraint is active at the optimum, the 
Jacobian is no longer singular. This approach is pretty common in literature, 
corresponding to the use of “bounded” slack variables [Betts, 2010]. 


8.2.2 Dynamically enforced complementarity 


The complementarity constraints are algebraic conditions applied to the 
dynamical system described in Eq. (8.1), thus they can be enforced using a 
Baumgarte stabilization method [Baumgarte, 1972]. In particular, we impose 
the complementarity constraints to converge dynamically to zero. This result 
is achieved by setting the constraint dynamics as follows: 


d 
a (hip) n(ip) if) = —Kps (nap) n(ip)" if) (8.8) 
with Kps € Rt a positive gain. Hence, the product h(;p) n(ip)'if would 
exponentially decrease to zero at a rate dependent on Kps. We can expand 
the time derivative as follows: 


(>) = < (A(ip)) nip) if thh): fT E (n(ip)) +h(ip) nip) if, (8.9) 


where we exploited the fact that Eq. (8.7) is scalar to ease the computation 
of each term. We can substitute the time derivative of the h(-) and n(-) 
functions with the following relations: 


i (hip) = + (hGp)) iB, (8.10a) 
< (n(ip)) = = (n(;p)) ip. (8.10b) 


For simplicity, let us define ¢ as follows: 


Ge 5 (AGP) eB nap) if + MPF 5 (ap) p= hp) nee) a. 
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Finally, we obtain the condition 


C = -Krs (hap) nlp)": f) - (8.11) 
In case of planar ground, we have the following relations: 
h(ip) = €3 iP, (8.12a) 
= (h(Gip)) =e3, (8.12b) 
n(ip) = e3, (8.12c) 
Ip (n(ip)) = 03x3. (8.12d) 


In fact, in this specific case, the normal direction and the height of the point 
coincide with the z-direction and the corresponding coordinate, respectively. 
Hence, in the planar case, ¢ would reduce to 


Goianar = iDz i ifa + iPz': ifa (8.13) 


We can relax Eq. (8.11) by exploiting again the fact that the product 
h(ip) nlip)l if is positive by construction. Henceforth, if we impose 


C < -Kis (hap) n(ip)' if) ; (8.14) 


we still have exponential convergence, at a rate which is higher or equal to 
the one specified by Kps. Finally, similarly to Eq. (8.7), we add a further 
relaxation through a positive number € € R to increase the feasibility region, 


C < -Kis (nip) n(ip)" if) tE (8.15) 
obtaining the final version of the complementarity condition. 


8.2.3 Hyperbolic secant in control bounds 


We can impose Eq. (8.6) dynamically by enforcing the following set of 
constraints on the control input: 


meaning that when the point is in contact, u, p is free to take any value in 
[—My, Mz] with My; € R? a (non-negative) control bound. On the other 
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Fig. 8.1 The plot of sech(102). 


hand, if the contact point is not on the walking surface, the control input 
makes the contact force decreasing exponentially (Eq. (8.16b)) at a rate 
depending on the positive definite control gain Ky € R?*3. Defining ô*(:p) 
as a binary function such that 


O*(Gp) = Gp) (8.17) 
0 hip) #0, 
it is possible to write Eq. (8.16) as a set of two inequalities: 
= K; (1 = 6*(ip)) if to (ip) My = U,f- (8.18b) 


Even if 6*(;p) would require the adoption of integer variables, it is possible 
to use a continuous approximation, 6(;p), namely the hyperbolic secant, an 
example of which is shown in Fig. 8.1: 


(sp) = sech (ky, h(ip)) , (8.19) 


where kp is a user-defined scaling factor. Notice that, when 6*(;p) = 0, the 
bounds coincide and are equal to —Kf;f. 

As discussed in Sec. 8.2.2, we can simplify the lower bound defined in Eq. 
(8.18a), allowing the force to decrease at a higher rate than the one given by 
Eq. (8.18b). Hence, we can rewrite Eq. (8.18) as: 


— My <u, < -K4 (1 — ôlip)) if + 6(ep) My. (8.20) 
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Given Eq. (8.3), it is enough to apply any of these equations only to the 
force component normal to the ground: if it decreases to zero, also planar 
components have to vanish to satisfy friction constraints. Hence, we can 
simplify Eq. (8.20) as follows: 


—e3 Mj < e3 up < -Kp (1 — ôlip)) nip) if + O(ep)e3 My, (8-21) 


with Kẹ- the corresponding element of Kẹ. 


8.2.4 Summing up 


We analyze different methods for expressing the complementarity constraints 
defined in Eq. (8.6). In particular, we have: 


e hip) np) if <6 
e C < -Kos (AGP) nip) if) +8; 
e —e; M; < ej u,f < -Kp (1— ôlip)) n(ip) if +êlipje] My. 


It is also realistic to assume the control inputs to be bounded, i.e. —My < 
u,f < My, while this is not necessary when using Eq. (8.18b) since bounds 
are already included in the constraint. 

Note that these conditions do not depend on the type of ground, which 
is considered rigid. The parameters involved do not have a direct physi- 
cal meaning (like in compliant contact models), but rather determine the 
“accuracy” of the simulated behavior. 


8.3 Prevention of unrealizable motions for the con- 
tact points 


The complementarity conditions presented above cannot prevent the contact 
points to move on the walking plane when in contact. In fact, even if friction 
constraints defined in Eq. (8.3) are satisfied, the contact points are still free 
to move on the walking surface. Force and position variables are until now 
almost independent. It is possible to prevent planar motions when in contact 
by limiting the effect of the control input u,, along the planar components: 


thp) lip = tanh (ki A(:p)) [e1 e2]' up (8.22) 


where k; € R is a user-defined scaling factor. Eq. (8.22) multiplies the 
control input along the planar direction to zero when h(;p) is null and, at 
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the same time, it reduces the velocity when the contact point is approaching 
the ground. It is possible to rewrite Eq. (8.22) as ip = T(ip)u,p, where 


tanh (ky h(ip)) 
T(ip) =7Rplane diag | | tanh (ky h(sp))| |. (8.23) 
1 


Note that, from now on, wu,» is assumed to be defined in plane coordinates. 
Thus, the normal component of the velocity is directly affected by e3 Up. 
Also, it is necessary to bound this control input, up € |- Myv, My], My € 
R3, to properly exploit the effect of the hyperbolic tangent. 

While each contact point is supposed to be independent from the control 
point of view, they all need to remain on the same surface and maintain a 
constant relative distance, since they belong to the same rigid body. At the 
same time, we want them to be within the workspace reachable by the robot 
legs. We can achieve both objectives with the following algebraic condition 
acting on each of the contact points: 


ip = T ot ip, (8.24) 


where §°t;p is the (fixed) position of the contact point within the foot surface, 
expressed in foot coordinates. Here, the transformation matrix + Hfoot would 
depend on the base position fpg, the base quaternion 7pg and the joint 
configuration s. As a consequence, the full kinematics of the robot is taken 
into consideration. 


8.4 Momentum dynamics 


In Sec. 8.1, we consider the contact points as if they have the possibility of 
exerting a force with the environment. Here, we describe the effect of this 
forces on the robot through the centroidal dynamics introduced in Sec. 3.5. 
This choice is supported by the fact that the momentum dynamics depend 
only on the contact forces, their location and on the CoM position: 


gh =mg+) > aX’ g =mġĝ+)_ 


Compared to Eq. (3.47), we simplify the matrix @X* since no torque 
is applied at the contact points. We also need to make sure that the CoM 


13 
(ip = LComM) 


| T. (8.25) 
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position obtained by integrating Eq. (3.43) corresponds to the one obtained 
via the joints variables. This is done through the following algebraic equation: 


Com = CoM( pg, Ipp, s) (8.26) 


where CoM(‘pzg,Zpz,s) is the function mapping base pose and joints posi- 
tion to the CoM position, i.e. the right hand side of Eq. (3.41). While this 
equation defines a link between the linear momentum and the joint variables, 
the same would not hold for the angular part. In other words, we need to 
model how the angular momentum affects the evolution of joints. To this 
end, we can exploit the Centroidal Momentum Matrix Jemm. In fact, we 
can extract the angular part of Eq. (3.46), obtaining the following condition: 


Gh® = [03x3 13] Jommv. (8.27) 


Here, the system velocity v contains the base angular velocity ? wp. It 
can be substituted with the quaternion derivative through the map G. It is 
defined in [Graf, 2008, Section 1.5.4] as 


—Topa ToBo ToB3 Topo 
G(Zps) = |—7ep2 —*pp3 TeBpo *pBa (8.28) 
Tops tTpp2 —ppa TpBo 
such that 
Purp = 2G(*pp)up. (8.29) 


Hence, it depends on the same control input introduced in Eq. (8.5b). 


8.5 The complete differential-algebraic system of 
equations 


By summarizing all the ODEs and algebraic conditions introduced in this 
chapter, we obtain the following inequality constrained DAE. 
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e Dynamical Constraints 


i Uf; V contact point, 
ip =T(ip)U,p, V contact point, 


eh = se N 


1 
LCoM = ai (gh?) , 


(ip — oe ad if, 


F Tp- B 
ps = RB” VIB, 
FA é 
PB = Up, 
S = Us. 


e Equality Constraints 


ip = Hoo ip, V contact point, 
ZLCoM = CoM(*pp,* pp, 8), 
Purp 
Gh” = [03x3 13] Jomm |2G(7pp)up| , 
Us 
l Psl? =1 


e Inequality Constraints, applied for each contact point 


n(ip) if > 0, 
ltp) if ll < us alip) if, 
— My <u,» < My, 
— My <u, < My, 
h(ip) = 0, 


Complementarity, see Sec. 8.2.4. 
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The Whole-Body Non-Linear 
Predictive Controller 


Given the model presented in Chapter 8, we defin[] in Sec. 9.1 a set of 
additional constraints, while, in the following sections, we describe a set of 
tasks aimed at the generation of walking trajectories. They compose the 
full optimal control problem presented in Sec. 9.4. More specifically, the 
additional constraints presented in this chapter are inequalities which prevent 
the generation of undesirable or unfeasible motions for the robot. On the 
other hand, the tasks are the mathematical description of the goals we want 
to achieve when planning for the walking motions. Finally, Sec. 9.5 presents 
the software infrastructure used to solve the optimal control problem. 


9.1 Walking specific constraints 


While taking steps, we need to make sure that the robot legs do not collide 
with each other. Self collisions constraints are usually hard to consider and 
may slow down consistently the determination of a solution. A simpler 
solution to avoid self-collisions between legs consists of avoiding the left one 
to be on the right of the other leg, even if cross steps are then forbidden. We 
assume the frame attached to the right foot to have the positive y—direction 
pointing toward left. In this case, it would be sufficient to impose the 
y—component of the "a, (i.e. the relative position of the left foot expressed 
in the right foot frame) to be greater than a given quantity, i.e.: 


esx, > iain’ (9.1) 
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In other cases, we want to prevent the solver to find a solution requiring 
the robot to raise the swing leg too much. Once the leg is raised, there are 
infinite motions which generate the same centroidal momentum. Too wide 
motions of the swing leg may cause other self-collision, especially between 
the arms and the thigh. Hence, we set an upper-bound on the difference 
between the height of the two feet. In particular, we can consider the mean 
position of every contact point to simplify the definition of the constraint: 


— Mns < e3 (Pi — #Pr) < Maz (9.2) 
where pı and 4p, are the mean positions of all the contact points of the 
left and right foot respectively, i.e. po = oe iPo- Np is the number of 


contact points in a single foot and Maf € RY is the constraint upper-bound. 
Some additional constraints can be considered: 


ZCoM,z min < €3 LCoM; (9.3a) 
~ My, < ah” < Mn.. (9.3b) 


Eq (9.3a) avoids picking solutions which would bring the CoM position too 
close or even below the ground. Eq. (9.3b) set a bound Mp, € R3 to the 
angular momentum. These constraints avoid considering trajectories that 
would cause excessive motions or let the robot falling. 


9.2 Tasks in Cartesian space 


In order to make the robot move toward a desired position, it is necessary to 
specify tasks in Cartesian space. 


9.2.1 Contact point centroid position task 


We define as a task the L2 norm of the error between a point attached to 
the robot and its desired position in an absolute frame. Suppose we choose 
the CoM position as a target point. By moving its desired value forward in 
space, the robot could simply lean toward the goal without moving the feet. 
This undesired behavior may lead the robot to fall. It is possible to avoid 
the robot leaning forward by locating the target point on the feet instead of 
the CoM. In particular, we select the centroid of the contact points as target, 
thus avoiding specifying a desired placement for each foot: 


1 * 
Lyr = slap — P liwe (9.4) 


where yp = 3(4p1 + pr) and yp* € R? is its desired value. 
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9.2.2 CoM linear velocity task 


While walking, we want the robot to keep a constant forward motion. In 
fact, since foot positions are not scripted, it may be possible to plan two 
consecutive steps with the same foot. Requiring a constant forward velocity 
can help avoiding such phenomena. This task can be defined as: 


1 1 
Done = allah? - m&eoullw, (9.5) 


with £ģoņ a desired CoM velocity. It is possible to select and weigh the 
different directions separately through the weights Wy. 


9.2.3 Foot yaw task 


The task on the centroid of the contact points, defined in Sec. 9.2.1 allows 
defining the direction the robot has to step. With this task, we specify at 
which angle the foot should be oriented with respect to the z-axis, in brief 
the foot yaw angle. Define y as the desired yaw angle for either the left or 
the right foot (o is a placeholder). We construct a unitary vector & € R? 
belonging to the xy-plane (of T), oriented such that the angle with the x-axis 
of Z corresponds to 73. Its components are: 


f= Fae . (9.6) 


sin(y3) 


Similarly, the vector £, € R? is fixed to the foot and parallel to the foot 
x-axis. This vector can be easily obtained as a linear combination of the 
contact points position. The goal of this task is to align £, to €3. This can 
be achieved by minimizing their cross-product, which corresponds to the 
following task: 


2 


Ea ; | [- sin) eos(72)] £0 (9.7) 


lyr 


Notice that Eq. (9.7) has a minimum also when &, is null. In other words, 
a can be minimized by shrinking the projection on the xy-plane of the 
vector attached to the robot foot. This is undesired because it would set the 
foot to be perpendicular to the ground. Hence, we consider also a second 


vector attached to the foot and perpendicular to £, called £+. This vector 
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is parallel, and have the same direction of the foot y-axis. Hence, the final 
task has the following form: 


Dyan =J [FO cost) & l 


lr 
1 
P 
lr 


Eq. (9.8) does not impede the foot roll and pitch motions during swing. 


: (9.8) 


Jeos(73) sin(5)| eż 


9.3 Regularization tasks 


The dynamical system of Eq. (8.5) depends on a high number of variables. 
Despite the additional constraints of Sec. 9.1 and the Cartesian tasks of Sec. 
9.2, a consistent part of the dynamics is not taken into consideration nor 
constrained. For this reason, it is necessary to introduce regularization tasks 
which contribute in generating walking trajectories. 


9.3.1 Frame orientation task 


While moving, we want a robot frame to be oriented in a specific orientation 


z frame: We thus need to define an orientation error measure. In particular, 


we weight the distance of the rotation matrix 7Rgame = TRE! a Risme 
from the identity. Having to express this task in vector form, we can convert 
T Re-ame into a quaternion (through a function quat, which implements the 
Rodrigues formula, see Eq. (B.1)) and weight its difference from the identity 


quaternion I,. Namely: 


2 


quat (“Rirame) -Å (9.9) 


1 
Trame = 9 


This task can be applied on multiple bodies, like the robot torso and waist. 


9.3.2 Force regularization task 


While considering each single contact force in a foot as independent, they 
still belong to a single body part. Thus, we prescribe the contact forces in a 
foot to be as similar as possible, refraining from using partial contacts if not 
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strictly necessary. This can be obtained through the following: 


2 
Np 
Tregt = n if —diag(ia*) X af (9.10) 
i 7 We 

Here ;a* € R? determines the desired ratio for force i with respect to the 
total force. For example, if we want all the forces in a foot to be equal, it is 
sufficient to select all the components of ;a* € R? equal to =. In this case, 
the corresponding CoP is the centroid 4p. In other cases, if may be helpful 
to move the CoP somewhere else in the foot. In this case is sufficient to 
compute the corresponding ;a*, for example by using the technique presented 
in Appendix A. 


9.3.3 Joint regularization task 


In order to avoid the planner to provide solutions which require huge joint 
variations, we can introduce a regularization task for the joint variables: 


1 
Lee = 5 |i + .K.(s — s*) 2 


lw, (9.11) 


with s* a desired joints configuration and W; a weight matrix. The minimum 
for this cost is achieved when s = —K;,(s — s*), with K, € R”*””, Ks = 0. 
When this equality holds, joint values converge exponentially to their desired 
values s*. In this way, both joint velocities and joint positions are regularized. 


9.3.4 Swing height task 


When performing a step, the swing foot clearance usually ensures some level 
of robustness with respect to ground asperity. Nevertheless, since the soil 
profile is supposed to be known in advance, a solution satisfying all the 
equations described in Chap. 8 may require the swing foot to be raised just 
few millimeters from the ground. In order to specify a desired swing height, 
we can adopt the P task: 


ie Er = (egip h) fer ex)" up 


It penalizes the Tei between the z-component of each contact point 
position from a desired height ¿sh* € R when the corresponding planar velocity 
is not null. Trivially, this cost has two minima: when the planar velocity is 
zero (thus the point is not moving) or when the height of the points is equal 
to the desired one. 


2 
| l (9.12) 


139 


9.4 The complete optimal control problem 


Given the set of equations listed in Chapter 8 and the tasks described in 
Chapter 9 it is possible to fill an optimal control problem, whose complete 
formulation is presented below. Here the vector w contains the set of weights 
defining the relative “importance” of each task. 


Pup 
Tgh? 
T frame 
minimize w Pregf ’ (9.13a) 
os Dregs 
Tswing 
TD yaw 
subject to: 
x= f(x,u), see Eq. (8.30), (9.13b) 
l<g(x,U) <u, see Eqs (8.31)—(8.32),(9.13c) 
ez" x) > dmin; (9.134) 
TCoM,z min < ed LCoM; (9.13e) 
— Mn, < ah” < Mra» (9.13f) 
— Maş < e3 (#pi — #Pr) < Marg. (9.13g) 


Here, the state variables ¥ are those derived in time, while U contains all 
the control inputs. Thus: 


if 
iP | Uf | 
Usp 
SCN se. eles a (9.14) 
LCoM UZB 
t pp | Up 


where the symbol : represents the repetition of the corresponding variables 
for each contact point. 
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Fig. 9.1 A graphic description of the software architecture. 


9.5 Software infrastructure 


The optimal control problem described in Sec. 9.4 is solved using a Direct 
Multiple Shooting method, Sec. 2.3.2. The system dynamics, defined in Eq. 
(8.30), is discretized adopting an implicit trapezoidal method with a fixed 
integration step, as described in Sec. 2.3.1. The corresponding optimization 
problem is solved thanks to Ipopt [Wächter and Biegler, 2006]. 

The pipeline from the problem definition to its solution is implemented 
using the iDynTree::optimalcontrol! library, allowing for easy testing 
of other integrators or solvers. The computation of the Hessian of the 
Lagrangian is done by adopting levi?, a recently developed library. More 
details can be found in Appendices B and C. Fig. 9.1 presents a graphic 
description of the software architecture. 

The walking trajectories are generated according to the Receding Horizon 
Principle presented in Sec. 2.4, adopting a fixed prediction window of 2 
seconds. The horizon is large enough to predict at least one full step. This 
planner is conceived for being adopted in conjunction with a whole-body 
controller or to be used in an off-line fashion. 


‘https: //github.com/robotology/idyntree/tree/devel/src/optimalcontrol 
*nttps://github.com/S-Dafarra/levi 
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Validation and Experimental 
Results 


In this chapter, we present the results obtained when solving the optimal 
control problem described in Chapter 9. In particular, we test its capabilities 
to generate whole-body walking trajectories for a flat ground using the 
model of the iCub humanoid robot introduced in Sec. 1.1. The chapter is 
introduced by Sec. 10.1 with some considerations about the planner. It is 
validated in Sec. 10.2 by visualizing the generated trajectories. Sec. 10.3 
compares the different complementarity constraints in terms of accuracy 
and computational overhead. Sec. 10.4 presents the results of applying the 
generated trajectories on the robot. Finally, Sec. 10.5 concludes the chapter. 


10.1 Considerations 


The optimal control problem described in Sec. 9.4 is built such that (almost) 
no constraint is task specific. As a consequence, it is particularly important 
to define the cost function carefully since the solution will be a trade-off 
between all the various tasks. On the other hand, the detailed model of the 
system allows achieving walking motions without specifying a desired CoM 
trajectory or by fixing the angular momentum to zero. Nevertheless, due 
to the limited time horizon, it is better to prevent the solver from finding 
solutions which would bring to unfeasible states in future planner iterations. 
For this reason, Eq. (9.3a) and Eq. (9.3b) are added, even if the bounds are 
relatively large. 
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)t=0.5s (b) t= 1.5s ) t= 2.5s ) t= 3.5s 


Fig. 10.1 Snapshots! of the generated walking motion. The red arrows 
indicate the force required at each contact point scaled by a factor of 0.01. 
These images have been obtained using the complementarity constraints of 
Sec. 8.2.2, but using the other methods, the result is visually identical. 


Another possible effect resulting from the application of the Receding 
Horizon principle is the emergence of “procrastination” phenomena. Due to 
the moving horizon, the solver may continuously delay in actuating motions, 
since the task keeps being shifted in time. A simple fix to this phenomena 
is to increase the weights w with time, such that it is more convenient to 
reach a goal position earlier. 

Finally, given that the problem under consideration is non-convex, the 
optimizer will find a local minimum. This may result in a sub-optimal 
solution for the given tasks, but this fact does not limit the applicability of 
the results to the robot. During the first iteration, the solver is initialized 
by simply translating the whole robot in the desired position. In successive 
iterations, the solver is warm-started with the solution previously computed. 

All the tests presented in the next sections have been carried on a 7*” 
generation Intel® Core i7@2.8GHz laptop. 


10.2 Validation 


The planner is set up using an integration step of 100ms , while the time 
horizon is 2s. The choice of a large integration step serves for two reasons. 


'nttps://www. youtube. com/playlist?list=PLBOchT-u69w9hJ6BmgPf06r0zWmungOrc 
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(c) Hyperbolic secant in control bounds 


Fig. 10.2 Planned position of one of the right foot contact points using 
different complementarity constraints. The walking phases are recognizable, 
but they are not defined beforehand. The controller does not specify directly 
when a phase begins and ends. 
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(c) Hyperbolic secant in control bounds 


Fig. 10.3 Planned CoM position using different complementarity constraints. 
It is possible to notice a continuous velocity on the x direction. The plots 
appear a little irregular. This may be a consequence of the chosen time step. 
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(b) Dynamically enforced complementarity 


—x —y zZ 
10 
£ 
z 
w SF | 
a 
g 
Be 
g 0 
g 
io} 
a 
8 5} | 
=) 
a0 
g 
< 
-10 $ 1 L 
0 5 10 15 20 
Time (s) 


(c) Hyperbolic secant in control bounds 


Fig. 10.4 Planned angular momentum obtained adopting different complemen- 
tarity constraints. Interestingly, in (a) there is an overshoot of the angular 
momentum on the x axis which is not present in the other two methods. 
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First, it reduces the number of variables used by the optimization problem 
(fixing the time horizon). Secondly, it allows inserting another control loop 
at higher frequency. After each iteration of the planner, the first state is 
used as a feedback to a new planner iteration in a receding horizon fashion. 

When planning, we control 23 of the robot joints. We consider four 
contact points for each foot, located at the vertices of the rectangle enclosing 
the robot foot. Concerning the references, the desired position for the 
centroid of the contact points is moved 10cm along the walking direction 
every time the robot performs a step. A simple state machine, where the 
reference is moved as soon as a step is completed, is enough to generate a 
continuous walking pattern. The speed is modulated by prescribing a desired 
CoM forward velocity equal to 5cm/s. As discussed in the following, we are 
able to obtain very similar results when using any of the complementarity 
methods described in Sec. 8.2.4. 

Fig. 10.1 shows some snapshots of the first generated step, while Fig. 
10.2 shows the position of one of the right contact points. It is possible to 
recognize the different walking phases, though they are not planned a priori. 
Nevertheless, the controller does not specify explicitly when a phase begins 
and ends. Notice that the stance phases obtained using the hyperbolic secant 
method, Sec. 8.2.3, are slightly longer than those produced with the other 
two methods, which are instead very similar. 

Figure 10.3 presents the planned CoM position. Here, it is possible to 
notice that the position along the x direction grows at a constant rate. This 
is a direct consequence of the task on the CoM velocity presented in Sec. 
9.2.2. Notice that the bound on the CoM height, £CoM,z min, is set to half of 
the initial robot height, but such constraint is never activated. The plots 
appear a little irregular. This is probably due to the choice of time step. 
In addition, these are the results of consecutive runs of the optimal control 
problem described in Sec. 9.4. From one iteration to another, the solver may 
find slightly different solutions because of the shift in the prediction horizon, 
causing the irregularities showed in the figures. 

Figure 10.4 shows the planned angular momentum, which is not fixed 
to zero. Although it is limited to 10 kg m?/s, such limit is never reached. 
Interestingly, the trajectories computed with the relaxed complementarity 
constraints showed in Sec. 8.2.1, produce a moderate overshoot in the 
component along x that is not present with the other two methods. This 
is probably due to a little wider motion while swinging a leg. Amongst 
the three, the dynamically enforced complementarity method is the one 
producing the smoothest result. 
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It is worth stressing that none of the tasks described above define how 
and when to raise the foot. By prescribing a reference for the centroid of the 
contact points and by preventing the motion on the contact surface, swing 
motions are planned automatically. Nevertheless, this advantage comes with 
a cost. It is difficult to define a desired swing time and, more importantly, 
the relative importance of each task, i.e. the values of w, must be chosen 
carefully. During experiments, we adopted an incremental approach. We 
added the tasks one by one, starting from T „p and then we gradually refine 
the walking motion by tuning a cost at a time. 


10.3 Complementarity constraints comparison 


In this section, we analyze the differences amongst the complementarity 
methods described in Sec. 8.2. As a measure of performance, we adopt 
the product between the normal force and the height of the contact point 
from the ground, i.e. ;p,-;f,. In other words, we test the accuracy with 
which Eq. (8.6) is satisfied, simplifying the formulation thanks to the planar 
ground assumption. Results are shown in Fig. 10.6. We tune the various 
methods in order to maximize such accuracy. Indeed, parameters that are 
too “restrictive” (e.g. an € too small) may prevent the solver from finding a 
walking strategy because the points are not able to move. On the other hand, 
low accuracy may mean the solver requires a force of considerable magnitude 
when the point is still raised from the ground. Given that robot weighs only 
30kg, even a small force affects the robot’s dynamics. Expectedly, higher 
accuracy involves more computational time. 
The parameters are chosen as follows: 


e Relaxed complementarity: € = 0.015[N - m]. 


N-m 
= 


e Dynamically enforced complementarity: Kps = 20 B ,E=0.2 [ 


e Hyperbolic secant in control bounds: Kf, = 300 B , kp = 350 [4] ; 


Mọ is set to 100N/s for all the components and it is common for all the 
methods. Fig. 10.5 shows the contact points’ height compared to the normal 
force applied to them, for both the two feet and for all the three methods. 
By superimposing quantities from the left and the right foot, it is possible 
to notice the different contact phases. The normal force applied to a point 
drops to zero when this moves, and vice-versa. Nonetheless, it is possible 
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(b) Dynamically enforced complementarity 


Fig. 10.5 Vertical position and normal force of each contact point. The quan- 
tities relative to a point are plotted together with those of the corresponding 
point in the other foot. The dashed lines are relative to the right foot. The 
main differences are visible around the zero axis, where it is possible to notice 
that, depending on the method, some normal force can be requested even if 
the height is not zero. Plots from all the points are shown for completeness. 
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(c) Hyperbolic secant in control bounds 


Fig. 10.5 (cont.) Vertical position and normal force of each contact point. 
The quantities relative to a point are plotted together with those of the 
corresponding point in the other foot. The dashed lines are relative to the 
right foot. The main differences are visible around the zero axis, where it is 
possible to notice that, depending on the method, some normal force can be 
requested even if the height is not zero. Plots from all the points are shown 
for completeness. 
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Fig. 10.6 Product between the vertical position and the normal force of each 
contact point, separated by foot, when using the different complementarity 
methods summarized in Sec. 8.2.4. The black dashed lines indicate the mean 
values. By plotting the result of p; - fz for each point, we show the accuracy 
of each method in simulating a rigid contact. 
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[s] Relaxed | Dynamically enforced | Hyperbolic secant 
Average 11.91 10.53 13.19 
Variance 15.62 13.05 12.39 
Minimum 1.00 0.98 0.77 

Maximum 83.50 68.90 55.08 


Table 10.1 Time performances using different complementarity methods. All 
times are showed in seconds and obtained after 200 runs of the solver. 


to spot millimetric motions even when a force is applied. This measure of 
accuracy is depicted in Fig. 10.6. 

The accuracy is the major difference between the three methods. The 
relaxed complementarity method is the one providing the smallest maximum 
value of ¿pz < ifz- This is a direct consequence of the choice of e. Nevertheless, 
it is the one with the highest mean value. In this case, the dynamically 
enforced complementarity method is the one providing the best results. 
Amongst the three, the hyperbolic secant method is the only one where the 
applied force consistently drops to zero in all the contact points when the 
foot is raised. In fact, with this method we force the normal force derivative 
to be strongly negative when the point is not on the ground. At the same 
time, this method does not prevent the point to move when a force is applied. 
For this reason, this method provides the highest peaks. 

Table 10.1 compares the three methods in terms of time required to find 
a solution. The values are obtained by measuring the time required to obtain 
the 200 points composing the trajectories shown in the figures of this chapter. 
The dynamically enforced complementarity method is the fastest one, while 
the hyperbolic secant one is the best in the other metrics. 

The variance is extremely high for all the methods. We noticed that 
the iterations with the highest duration were occurring when moving the 
reference for the centroid of the contact points. In fact, this is the moment 
in which the planner has to predict a full new step. Since we initialize the 
planner with the previously computed solution, in this instant the optimal 
point is far from the initialization. Hence, more time is required to find 
a solution. This issue can be addressed by providing the planner with a 
more effective initialization, for example by exploiting some of the methods 
presented in Part I. 

Overall, the dynamically enforced complementarity method is the one 
providing the best performances. 
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10.4 Experiments on the robot 


To further validate the output of the planner presented in this part, we tested 
the generated trajectories on the iCub humanoid robot. In particular, they 
are used as a reference for a joint position controller. Since their frequency is 
at 10Hz, we interpolate them to have a reference point every 10ms. Hence, 
the trajectories are replayed on the robot closing the loop only at joint 
level. The robot was able to perform several steps, as shown in Fig. 10.7, 
demonstrating the feasibility of the generated trajectories. 

As a result of Sec. 10.3, we decided to test only the best performing 
method, i.e. the dynamically-enforced complementarity method. In addition, 
compared to results shown in Sec. 10.2, we reduce the forward speed to 
3cm/s, advancing the mean point reference of 6cm at every step. In this 
case, it has been also useful to move the desired CoP position, as anticipated 
in Sec. 9.3.2. In particular, by moving it toward the inner foot edge, the 
robot walks more robustly. 

The trajectories are generated off-line for a span of twenty seconds, after 
200 runs of the solver. They are tested in open-loop, thus limiting the 
maximum velocity achievable by the robot. 


10.5 Conclusions 


This chapter validates what presented in Chapters 8 and 9. It shows that 
walking trajectories can emerge by specifying a moving reference for the 
contact points’ centroid and the desired CoM velocity only. 

The planner considers relatively large time-steps. This enables the 
insertion of another control loop at higher frequency, whose goal is to stabilize 
the planned trajectories. The code is implemented entirely in C++. The 
main bottleneck is represented by the computational time. A single planner 
iteration may take from slightly less than a second to more than a minute. 
This prevents an on-line implementation on the real robot. Nevertheless, the 
continuous formulation of the problem allows the application of techniques, 
like those presented in [Neunert et al., 2018, Farshidian et al., 2017], which 
solve the problem through the iterative application of fast LQR solvers. 

Finally, given the non-convex nature of the problem defined in Sec. 
9.4, it is fundamental to provide a meaningful initial guess. Indeed, local 
minima may bring the planner to “procrastinate”, as anticipated in Sec. 
10.1. An interesting future work consists in adopting Reinforcement Learning 


https://www. youtube .com/playlist?list=PLBOchT-u69w9hJ6BmgPf06r0zWmungOrc 
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(e) t = to + 4s (£) t = to + 5s (g) t = to + 6s (h) t= to + 7s 


Fig. 10.7 Snapshots? of the robot walking using the planned trajectories. 
The planner generates joint references which are interpolated and stabilized 
through a joint position controller. 
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techniques, like [Peng et al., 2018] to warm start the optimization problem. 
In addition, the definition of references can affect the time necessary to 
find a solution. In future works, we will investigate both the adoption of 
faster solvers and the definition of a more sophisticated and efficient way of 
providing references. 
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Epilogue 


The thesis presented the application of predictive techniques to tackle motion 
generation problems. In particular, Part II is devoted to specific applications 
of model predictive controllers and trajectory optimization. Chapter 5 uses 
an MPC controller as an emergency measure, to perform a step in case of 
strong external perturbation. Results are shown only in simulation. In fact, 
this controller is effective only if it can determine a reference for the contact 
wrenches at every control loop. The computational time required is an order 
of magnitude too high to be applied real-time. The MPC controller exploits 
the momentum dynamics to generate a control action, approximating the 
angular momentum to maintain linearity. 

In Chapter 6, we further simplify the centroidal momentum dynamics, 
enabling the adoption of on-line MPC for stabilizing the robot walking when 
controlled both in position and torque control. Desired footsteps are planned 
by approximating the robot as a unicycle. The problem of deciding where to 
step and at what time is solved by framing it as an optimization problem. 

Chapter 7 enriches the model used to plan walking trajectories, planning 
dynamic motions exploiting the robot momentum. We consider explicitly the 
forces applied at the two feet, constrained to produce zero angular momentum. 
The different walking phases are defined beforehand, but their duration is a 
decision variable. Noticeably, by using the generated trajectories we are able 
to reduce up to 20% the maximum torque required to the leading knee when 
performing a large step-up. 

Motivated by the performance gain obtained with a more detailed model, 
in Part III we explore whether it is possible to generate walking motions by 
specifying a minimal set of references, like a desired center of mass velocity 
and a reference position for the centroid of the contact points. We also 
compare how different ways of specifying the complementarity conditions 
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affect the overall planner performances. The generated trajectories are played 
on the robot to validate their efficacy. 

In this last application, computational time represents again a major 
obstacle limiting this application to offline trajectory generation. Summa- 
rizing, experiments on the real robot could be achieved in two cases: the 
considered model is simple enough to swiftly find a solution to the optimal 
control problem, or the model is sufficiently detailed to make the generated 
trajectories meaningful enough to be played on the robot off-line. 

The definition of contacts and their planning plays a majors role when 
generating locomotion trajectories. In all the applications presented in Part II, 
contacts are planned without taking into consideration the state of the robot. 
In Part III, we try to address this issue letting the optimizer determining 
where and when to instantiate a contact. Nevertheless, this was possible 
through a particular definition of cost function, tuning, and at the price of 
a higher computational complexity. The problem of deciding a convenient 
placement of contacts simultaneously to the generation of trajectories is far 
from being solved, especially in this thesis. Things get even more complicated 
and less tractable when several parts of the robot can produce a contact. In 
literature, authors are starting to use new techniques, like Reinforcement 
Learning, to exploit contacts on the whole robot body for generating complex 
motions [Hwangbo et al., 2019], representing an interesting future direction. 

The use of multiple limbs affects the definition and detection of “fall 
state”. For example, the Capture Point concept introduced in Sec. 3.6.2, 
determines a simple and powerful condition to detect when a robot is about to 
fall. It applies when the robot can be approximated as a pendulum, but such 
approximation would be too coarse when multiple contacts are instantiated in 
non-coplanar surfaces. Hence, in this case, the usual fall detection techniques 
would fail. In addition, if we consider the robot pelvis or the torso as viable 
bodies to instantiate a contact, the definition of fall state would complicate 
too. The robot laying on the floor would correspond to a situation where 
it is using multiple contacts, back included, to sustain its weight. It is an 
extreme, but we can argue that such configuration represents a fall state only 
if the robot was not supposed to reach it. As a consequence, we can propose 
the undesired loss of gravitational potential energy as a possible definition 
of “fall state”. At the same time, a fall can be detected when, giving the 
current contacts configuration, we are not able to control the robot in the 
neighborhood of a desired trajectory. To this end, we can exploit a prediction 
horizon, similarly to Sec. 5.5, to determine when this situation occurs. A 
future work consists in extending this concept, thus drawing capturability 
conditions [Koolen et al., 2012] also in the multi-contact scenario. 
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In the Prologue we mentioned that many humanoids fell during the DRC. 
Most probably, none of the methods presented here, if applied during the 
Challenge, would have changed that figure. The tuning and testing process, 
together with the abilities of the team of scientists controlling the robot, play 
the most important role for what concerns robustness. Nonetheless, we hope 
that the development of the techniques adopted in this thesis may help in 


designing controllers with higher degree of robustness requiring a smaller 
developer effort. 


Confusion is better than stupid conclusions. In confusion, there is still a 
possibility. In stupid conclusions, there is no possibility. 


Jaggi Vasudev 
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Appendix 


Equivalence Between a Contact 
Wrench and Four Forces at the 
Vertices of a Rectangular Foot 


Let us consider a foot of rectangular shape, having length l and width d. 
We define the foot frame as located at the center of the foot, oriented such 
that the x-axis is parallel with the side edge (of length l). The z-axis is 
perpendicular to the foot surface. With this choice, the four corners have 
the following coordinates: 


1p = , O), 2= 


l _d 
A (A.1) 
3P =|- » O), ap=|-57? 


We suppose a 3D force ;f is applied at each corner 7. Assuming we have 


a 6D wrench extf = ex f! m applied in the foot frame, we want to 
determine the corresponding value for the four corner forces. In particular, 
the following two conditions hold: 


if tof +3f +4f =enf (A.2a) 
Spat Sear (A.2b) 


Let us start by considering only the forces normal component. These 
have to match the total normal force and the torque applied along x and y. 
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Hence, we can extract the following three equations: 


ife +t2fz+3fz +4fz = ext fz» (A.3a) 
“af: 3f) “(fs fa) = extTx (A.3b) 
ssf: 4fz) sf: fz) = extTy-: (A.3c) 


From the last two equalities, we can obtain: 


3fz = af: + an + Se (A.4a) 
ve = afz 4 an = a (A.4b) 


We can now use Eq. (A.3a) to substitute of, ie. 


2fz = ext Sz = 1f = 3fz —4fz 


into Eq. (A.4a), which leads to 


xtT: axt Ta 
3fz = ext fz itz 3hz 4fz H ae 


_ ext fz ife afz | extTa extTy 


= H H A. 
2 2 2 2d 2l n 
Now, we can also substitute Eq. (A.4b) to obtain 
Do ext Jz afz + ee amd catty 4fz extūTx | extTy 
ade = 2 2° od * 2 
ext Jz ext Ty 
5 T4 i (A.6) 
Thus, we can finally write 1 fz, 2 fz and 3fz as a function of 4 fz: 
itz = 4fz ae ae oe ao (A.7a) 
ext dz ext Tax 
2 z ; A.7b 
of 5 af 7 ( ) 
ext Jz ext 
3fe = if fe ; as (A.7c) 


There may be infinite solutions according to the choice of 4f,. Nevertheless, 
if we impose these normal forces to be positive, the set of feasible solutions 
may considerably shrink. 
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In fact, by considering ;f, > 0, we obtain the following set of inequalities: 


sfe > — P+ SY, (A.8b) 
i < E E Se (A.8c) 
Neate (A.8) 


Let us divide all the inequalities above by ext fz and define ;a@ as the ratio 
of the total normal force each corner is carrying, i.e., ¿œ = (ifz)/(extfz). In 
addition, the CoP measured in the foot frame has the following coordinates: 


Copp Se") -Copya =. 
ext fz ‘i ext fz 
We can finally rewrite Eq. (A.8) accordingly: 
4a > 0, (A.9a) 
CoP. FP; 
e eae (A.9b) 
d l 
1 CoP 
ee y : 
4aS5 T> (A.9c) 
1 CoP, 


Since the CoP is constrained inside the foot polygon, i.e. CoP, € [—1/2,1/2| 
and CoP, € [—d/2,d/2], we also obtain that 4a € [0,1]. In fact, a negative 
aœ would correspond to a negative normal force. On the contrary, any a 
greater than 1 would impose a normal force in another corner to be negative 
to satisfy Eq. (A.3a). We can condensate the bounds as follows: 


CoP, CoP; . {1 CoP, 1 CoP, 
< < ; x 
max (0. 7 ) < 4a < min G 79 i ) (A.10) 


Hence, when the CoP is in position 4p, the two bounds will coincide and 
equal to 1. If the CoP is in the opposite corner, 1p, then 4a needs to be 0. 
In all the other cases, the bounds do not coincide, obtaining infinite possible 
values for 4a. A simple solution is to choose 4a as the mean point of the 
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bounds. This leads to the following: 


CoP CoP, : 1 CoPy 1 CoP, 
(max (0, T 7 ) -min (4 TE 7 )) 


44a = 5 , (A.11a) 
Py P 
a= 404 2 | a 2, (A.11b) 
1 CoP, 
a A.11 
2Q 4Q 9 A 5 ( c) 
3a = 4a ; ee (A.11d) 


Interestingly, with this choice, when the CoP is in the center of the foot, all 
ja are equal to 1/4. 

Once all the normal forces have been identified, the other components, 
ie. ifs and ; fy, can be obtained via pseudo-inverse starting from Eq. (A.3). 
In particular, we have to satisfy the following three conditions 


ite + 2fx + 3 fx + 4fe E ext Jz, (A.12a) 
ity + 2fy + 3fy + 4fy = ext fy, (A.12b) 

Sie aaa be A.12 
a iPy ipa extTz5 ( c) 


using eight unknowns. Hence there are infinite solutions. Notice that friction 
constraints may not be satisfied by a corner force, even if the original contact 
wrench does. 
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Appendix 


Jacobians of Kinematic and 
Dynamic Quantities 


In this appendix, we present the partial derivatives of several quantities 
exploited in Sec. 9.4. As discussed in Sec. 9.5, the optimal control problem is 
turned into an optimization problem and solved through off-the-shelf solvers. 
Such tools need to evaluate the partial derivatives of cost and constraint 
functions. To this end, we define an expression for those derivatives which 
are not trivially computable, usually involving non-linear functions of the 
variables involved in Eq. (9.14), namely in ¥ and U. In this appendix, 
we use extensively the velocity trivializations presented in Sec. 3.2. The 
dissertation shown here complements the results of [Carpentier and Mansard, 
2018]. 


B.1 Partial derivative of a rotated vector 


In this section, we face the partial derivatives computation of expressions 
involving a rotation matrix. A possible example is given by Eq. (8.5a), where 
we seek to find the partial derivative of IRpP vz,B With respect to the base 
quaternion fpg. The computation of Foon T Rpg would result in a tensor, 
difficult to be handled. Indeed, it is easier to consider the case where the 
rotation matrix is multiplied by a generic vector x € R?. 

Define R € SO(3) as a generic rotation matrix and p € H the correspond- 


ing quaternion. We can rewrite the product Ry by using the Rodrigues 
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formula [Murray, 2017]: 


RX = X + 2pwrx + 2r^r^x, (B.1) 
where 
Pi 
r= |p; (B.2) 
Pk 


contains the imaginary part of the quaternion and pw is its real part. Thus: 


ð 
Bon = 2r^x, (B.3a) 
Oey = — Dog + ae (r^) r^x + A (r^) X (B.3b) 
Or o Or Or : l 
= — 2p X^- 2 (r^x) ^ — 2r^x^, (B.3c) 
ð 
Bp X = [2r^x —2pu X^ — 2 (r^x)^ — 2r^x^]| : (B.4) 
Here we extensively used the property (æ)^y = —(y)^z. 


B.2 Partial derivatives of a non unitary quater- 
nion 


The quaternion modulus constraint can be broken across solver iterations. 
In fact, a generic solver can search over an unfeasible region while reaching 
the optimal solution. When computing the partial derivatives involving the 
base quaternion, we are implicitly assuming that it expresses a rotation (see 
for example Eq. (B.1)). This is true only if it has unit modulus. In order to 
satisfy this assumption, we have to normalize the quaternion before using it: 


N P P 
P => — = . 
liell p'P 


(B.5) 
This means that in order to compute the partial derivatives with respect to 
the not unitary quaternion, we have to perform the following operations: 


aC) _ 20) ap” 
Op = OpN ðp ` 


(B.6) 


The first component, on is the partial derivative of a generic function 
considering a normalized quaternion. For what concerns the second part 
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instead: 


apn 1 T\ T 
iya , B.7 
TE (p p) pp (B.7a) 
1 T | 
= eee p= l B.7b 
(pe'p) llell (( p) B ee 


Thus, Eq. (B.7b) must be multiplied on the right of all derivatives which 
consider an unit quaternion. 


B.3 Partial derivatives of adjoint matrices 


In this section, we present the partial derivative of an adjoint transform 
with respect to joint values. Let us consider two frames P and C connected 
through a single joint j. The time derivative of the adjoint matrix ? Xc is 
as follows [Traversaro, 2017, Eq. (2.36)]: 


P Xo = P Xo (Vex) g (B.8) 
T 
The operator x is defined such that, given spo = |Tv} o wh we 
> P,C P,C 
obtain: 
CoA C p^ 
fn | YP VPRO 
SP.CX := e oun | (B.9) 
From Eq. (3.31) we have 
“Veco = “spc ŝj, (B.10) 
thus we finally obtain 
o 
— P Xo = P Xo (spox) : (B.11) 
Os; 


Let us consider now a more generic case where P Xp defines a transfor- 
mation between two frames, D and E, rigidly attached to the robot. The 
partial derivative with respect to s; depends on whether joint 7 belongs to 
mp(E). If this is the case, we have 


PXp=?Xp? Xo Xz, (B.12) 
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where only the transformation ” Xç depends on s;. As a consequence 


ð 
7a; XE =” Xp” Xc (spox) Xp, 
Sj 


(B.13) 
= D Xo (spox) CX. 
If instead j € tp(£) the partial derivative is null. To summarize 
Xo (Cspox)°Xp, iff €mp(E 
EAS = c\ spc E, ifj €mp(F), (B.14) 


Sj 0 otherwise. 


Analogously, it is possible to define the partial derivative of the 6D force 
coordinate transformation pX. In particular, 


ð C\— CC o% 
a X ) = pX SPCX . (B.15) 


The operator x* is defined such that, given © 


obtain [Traversaro, 2017, Eq. (2.50)]: 


T 
= T T 
SPC = [Cupo wko] we 


(OPINN 

C so , wWpo 03x3 

SPCX 5 | CLA C.A 7 (B.16) 
Upc “PC 


B.4 Partial derivatives of transformation matrices 


The relative transform P Hg determines a change of coordinate for a generic 
vector “x, such that: 
Dy = P Hg?x. (B.17) 


Assuming “x to be constant, this equation would depend only on the joint 
values connecting the two frames. In order to compute the partial derivative 
with respect to a generic joint value s;, first we can rewrite Eq. (B.17) as 


Py =P Rp" x + Pop. (B.18) 


B.4.1 Relative position 


The partial derivative of a relative position with respect to joint values can be 
obtained through the relative Jacobian. Denoting ” J D,E as the left-trivialized 
Jacobian, the relative left-trivialized velocity is equal to 


=" Jp ES, (B.19) 


from which we obtain 


ôP og Os 
D; D D : E 
—>R [a 0 | Vein oe 2a B.20 
OF E 3 3x3 DES ðs ðt ( ) 
hence ə 
5g OF = P Rg [1s 03x3] > Ip p. (B.21) 
Assuming j € tp(£), the partial derivative w.r.t joint j consists in 
ð 
Fe; OF = Rr [ts 03x3| EXc°spo. (B.22) 
j 


B.4.2 Relative rotation 


The partial derivative of P Rg? x instead requires some manipulation. In 
particular it is possible to exploit the result of Eq. (B.14) by noting in Eq. 
(3.20) that the upper left block of the adjoint matrix contains the rotation 
between the two frames. Thus, we have 


E 
PRg?x= (1s 03x3]? Xe |0% |> (B.23) 
03x1 
or alternatively 
1 
P Rg”x = E 03x3] DPXpg l 3 | Ey, (B.24) 
3x3 
Assuming j € tp(£), we obtain 
W Dp En, _ D C C Px 
Reg“x= |l 03x3| “Xo (“spcx )~ Xe , (B.25) 
Os; : 03x1 
allowing us to build 2? Rp’ x by column. 
B.4.3 Transformations relative to the inertial frame 
Eq. (8.24) presents a coordinate transformation of a (fixed) vector '°;p. 


The transformation matrix 7 Hj, depends on the base pose and on all the 
joints connecting the foot frame to the base. It is possible to split this 
relation in the following two equations: 


ip =" Hp” ip, (B.26a) 
Bp E B Hrot ip. (B.26b) 
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Eq. (B.26b) depends only on joint positions and its partial derivative can be 
computed as showed in Sec. B.4. On the other hand, Eq. (B.26a) depends 
only on the base position Fpp and its orientation, expressed through the 
quaternion pe. By rewriting it as 


ip =*Re? p+ pp, (B.27) 


the derivative with respect to the base position can be computed trivially, 


G] 
ôT pp 


while (7Re* ip) can be computed as in Sec. B.1. 


B.5 Partial derivatives of the base velocity 


The left-trivialized base velocity is defined as 
B Burp 
Vip=|p.” |-> (B.28) 


While the linear part is contained in the optimization variables, the angular 
part is not. In fact, the quaternion time derivative is used instead inside U. 
As shown in Eq. (8.29), the base angular velocity depends linearly on the 
quaternion derivative, while the map G depends on the base quaternion 7 pp. 
For simplicity, let us drop the superscripts and define w as the left-trivialized 
angular velocity. r is the imaginary part of the base quaternion and pw is 
its real part while r and pw compose the quaternion derivative. Then, from 
[Graf, 2008, Sec. 1.5.4], we have 


w = 2 (pwr — bur — r^t). (B.29) 


We finally obtain 


Zu = Ei aN pwt] , (B.30) 


B.6 Partial derivatives of a link velocity 


Let us consider the left-trivialized velocity of a generic link L. This element 
depends on a series of quantities, namely: 


e joint positions s, 
e joint velocities s, 


e base velocity ? Vip. 
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The partial derivative with respect to å and ? Vz,p are trivially obtained from 
the corresponding left-trivialized Jacobian. Contrarily, the partial derivative 
with respect to joint values requires some manipulation. 

At first, we compute the partial derivative of a velocity relative to the 
robot base (hence not depending on base quantities). Subsequently, the base 
velocity is added. In Eq. (3.31), the matrix KX msi depends on (a subset 
of) the joint variables. By considering one single joint at a time, the partial 
derivative of “Vp L is: 


ð o ; 
a Ven = 5 L Xe (°°) PX a Sk, (B.31) 
J kenp(L) J 


where C = mp(j), i.e. the child of joint j and P = Apg (j) its parent. By 
applying Eq. (B.11) we obtain: 


ð . 
——"Vp L = `> (‘xe (CXP"sc,px) Xmen] Sk. (B.32) 


Os; kenp(L) 


Note that, compared to Eq. (B.11), the order of P and C is inverted. 
We can restore the previous order by noting that 


spc = -Psc P, 
a 2 5 (B.33) 
spc = “Xp spc. 
In addition, it is possible to exploit the following property: 
((?Xe°sre) x) = P Xo (spox) C Xp. (B.34) 
so that 
C Xp”so px = —°spo x CXp. (B.35) 
As a consequence, we obtain 
o 
5 Ven = y (‘xe (“src x Xp) PX msa" ®s) Sk, 
J kémp(L) 
Sa z ("Xc spo x CX math)? s) Sk. (B.36) 
ken p(L) 


Joint j must also belong to mg(L) and k < j, i.e. joint j must be in the 
chain from k to L (using B as base), otherwise the matrix Bearer does not 
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depend on s;. Henceforth, the summation is on the part of the kinematic 
tree which starts from B and ends in C. This set corresponds to 7p(C). 
Thus, we have: 


Lr 


Ass VBL = —-X0°spc x `> (K E) Sk. (B.37) 
J 


kenp(C) 


Applying Eq. (3.31) to the summation on the right hand side, we can write 
ð 


a Var = —"X0°spo x Ve c. (B.38) 
J 


Let’s consider now the velocity of link L with respect to an inertial frame 
T, obtained through Eq. (3.29): 


Ap 


ð 0 
aa. VIL = —! Xp? Vip + —! Vz L. (B.39) 
J 


Os; Os; 
The second part is obtained through Eq. (B.38), while the first is computed 
as in Eq. (B.14) obtaining 


-l Xcľspc x “Xp? Vip, j E€Tg(L), 


À (B.40) 
0 otherwise. 


O 
a he ee = 
Sj 


Finally, substituting Eq. (B.38) and Eq. (B.40) into Eq. (B.39), we obtain 


ə LXofs x (-°v, e enai 
Iy C“ SPC Ac), j ETB(L) (B.41) 
Ss 


Os; 0 otherwise. 


B.7 Partial derivatives of the CoM position 


Eq. (8.26) imposes that com should match the CoM position computed 
from forward kinematics, CoOM(‘pz,Zpp, s). In order to compute the partial 
derivative of such function w.r.t base and joint variables, we can resort to 
the CoM jacobian Jcom. We can split Jcom as follows: 


Icom = | Teena JooM Jm] , (B.42) 
such that 
E . 
i PB 
VCoM = i JM Tecil Twr,B - (B.43) 
8 
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where Ucom is the CoM velocity relative to the inertial frame Z, expressed 
in a frame located on the CoM having orientation parallel to Z (mired 
representation). Thanks to this choice of frame, the CoM jacobian Jcom 
maps the base position time derivative 7 pz, the right-trivialized base angular 
velocity 7wz,p and the joint time derivative into the CoM velocity. Thus, 
we have: 


ð l 
5; C°MČ pe, Tps, s) = JÈM, (B.44) 
Jp; CoM( pg, Ipp, s) = JÈ m: (B.45) 


For what concerns the rotation part, we need the partial derivative with 
respect to the quaternion. Given that we are considering a Jacobian in mixed 
representation, we need to map a quaternion derivative into an angular 
velocity expressed in an inertial frame (right-trivialized). This relation is 
defined in [Graf, 2008, Sec. 1.5.3]: 


Twr pg =2E7 pp. (B.46) 


Finally we have 


o 
BE pg OOM pa. Len, 5) = Wey E. (B.47) 
Alternatively, we can obtain these partial derivative by starting from the 
following relation: 


com = Hp? poo; (B.48) 


where P pcom € R? is the CoM position expressed in base coordinates. In 
this way, the partial derivatives w.r.t base position and quaternion can be 
computed as in Sec. B.4.3 and we are left with the computation of 2% PCoM: 
Exploiting the results of Sec. B.4 and the CoM definition of Eq. (3.41), the 
partial derivative w.r.t. joint 7 is the following 


ð Mi ; 

Fe; POoM = > = (PR: [1s 03x3| 'Xo°spot 

Sj — m 

i:jem p(t) 

(B.49) 

tu 
+ EE 03x3| P Xo (spox) X; | ‘| 

03x1 

where the summation is over those joints having j in their path to the base. 
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B.8 Partial derivatives of centroidal momentum 


In this section, we analyze Eq. (8.27). The partial derivative w.r.t the 
(left-trivialized) base velocity and joint velocities consist in selecting the 
corresponding columns of the Centroidal Momentum Matrix Jemm, first 
presented in Sec. 3.5. In the following, we examine the dependency from the 
joint positions and base pose. 


B.8.1 Partial derivative with respect to CoM position and 
base pose 


Equation (3.45) depends upon the base pose and CoM position only through 
matrix ax? We can rewrite Eq. (3.45) as: 


ch = @X* XP? gh (B.50) 


where gh is the centroidal momentum expressed in base coordinates. We 
can expand Eq. (B.50): 


13 0 I Rpg 0 pgh” 
sh = ; B.51 
G En ‘| spe tha Ea ( ) 

obtaining 
T v 
Rp 0 Bh 

ah = ; B.52 
G peg = rou Re a Ba ( ) 


By substituting acom =* Re? pcom +% pp, exploiting the additivity property 
of the A operator and the rotational invariance of cross products, we obtain: 


TR 0 h” 
sh = 4 ne B.53 

G ane, ths a ( ) 
Hence the centroidal momentum gh does not depend on the base position: 


o 


—=— Gah = 0. B.54 
Opp? ( ) 


In order to compute the derivative w.r.t. the CoM position and the base 
quaternion, we can express the linear and angular momentum separately: 


gh” =*Reph’, (B.55a) 
gh” = =" ph neh? +7 Reph” (B.55b) 
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For what concerns the partial derivatives of the rotation matrix with respect 
to the base quaternion, we can use the results of Section B.1. Regarding the 
CoM position, we have: 


Fpa = a - (B.56) 

O. 

The partial derivative w.r.t £CoM, i.e. 2° pcom can be easily computed by 
Ox 


composition. Otherwise, it is possible to ‘exploit Eq.(B.49) to compute 
directly the partial derivative w.r.t joint variables for this component. 


B.8.2 Partial derivative with respect to joint values 


Let us consider now the partial derivative w.r.t joint values: 


o o 


o 
= z xB a xB 


Os; 
The first derivative can be computed from the results of Sec. B.8.1. In this 
section we focus on a (Bh). 

The centroidal momentum expressed in base coordinates is the sum of 
all the link momenta expressed in the same frame. Hence: 


goh = d n (sxi) LV; + Yo oX'hs, (‘Vz.:) (B.58) 


The derivative Bey (pX") is obtained as in Eq. (B.15), while Eq. (B.41) 
presents the partial derivative of the link velocity. Thus, we can rewrite Eq. 
(B.58) as follows: 


ð E AT T 
PR = `> BX @ospoxX*oX'T' Vri- BX’ Xo spc x (CVi.c) 
J i:jemp (i) 
= pX°[spcx* `> cX’L’ Vri | + 
ijerg(i) 


= `> pXx'l;'Xc | “spc x (CVic) (B.59) 
i:jemp (i) 


with j connecting the child link C to its parent P and the sums being on 
the links which have j in their path to the base. 
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B.8.3 Partial derivative with respect to base and joint veloc- 
ities 

Eq. (3.45) depends upon the base velocity P Vz g through ‘Vz. By using 

Eq. (3.29), we can write: 


a... 
=a Vi = ' XB, B.60 
OP Vi B E 2 ( ) 
hence ə 
-h— -yB ipi 


a 

This corresponds to the first 6 columns of the CMM matrix. The other rows 
corresponds to the partial derivative with respect to joint velocities. Given a 
generic joint j, the partial derivative w.r.t. $; is: 


a S 
az, ch = GX? XO px" Xo“spc, (B.62) 
J 


jem p (%) 


where we used the relation presented in Eq. (3.31). 


B.9 Partial derivatives of a quaternion as rotation 
error 


Sec. 9.3.1 adopts a task which minimizes the rotation error of a generic 
frame attached to the robot. Since the reference orientation is expressed in 
the inertial frame Z, this error depends on the kinematic status of the robot, 
viz. joint positions and base pose. 

To compute the partial derivatives of the function quat (4 Reame)s let us 
start by considering the time derivative of the quaternion error Perr as in 
[Graf, 2008, Section 1.5.4]: 


; 1 
Perr = 59 (Perr) Wer (B.63) 


where Wer is the left-trivialized angular velocity. Perr is the quaternion 
associated to the rotation error matrix 7 Rgame = TR) eS Rieme: while the 
left-trivialized velocity Wey is obtained as: 


. Vv 
Werr = (Pan Rinne : (B.64) 
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By substituting we have 


f 1 5 z y 
Perr = 59 (Perr) (Raa Rinne) , (B.65a) 

1 T T p* d Ip» T i 

z 2 G (Porr) Rframe Rframe dt ( Rfame Piste ) (B : 65b) 
1 T T p* T pel d T x 

= 2 G (Perr) Rerame Riraiie Rirame dt ( Risse) ) (B . 65c) 
1 : Vv 

= 59 (Perr) G Hee Rian) ) (B.65d) 
1 “3 

= 59 (Pers) AE oP aries (B.66) 


with ade ee the left-trivialized frame angular velocity. This velocity can 
be computed using the bottom three rows of the corresponding left-trivialized 
Jacobian “J, where we drop the subscripts for the sake of simplicity: 


vA . 


PB 
aE oer prays =" TS Bur pg ) 
S 
(B.67) 
J hactio 
= [ege w Jur ag) 2G (7 pz) ÈB : 
S 
Henceforth, the following derivatives are easily obtained: 
a 1 7 
ar Perr - 35 err aJe B. 
Po = F9 (Pen) (B.68) 
0 ae T; 
err = err ega 5 B.69 
PA G (Perr) GC pp) (B.69) 
a 1 ae 
a Per = 35 err “J5. B. 
75? 59 (Perr) WI (B.70) 
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Methods for Computing the 
Hessian of the Lagrangian Involving 
Kinematic and Dynamic Quantities 


In this appendix, we compute the Hessian of the Lagrangian related to the 
optimal control problem presented Sec. 9.4. In particular, in Sec. C.1 we 
introduce few mathematical tools useful to simplify the calculations. Most of 
the derivatives require the mechanical application of the results presented in 
Appendix B, thus they are not presented. In this appendix, we show those 
derivatives which are not trivial. 


C.1 Preliminaries 


Let us introduce few concepts that are helpful in the computation of the 
Hessian of the Lagrangian. 


C.1.1 Lagrangian Hessian by columns 
Let us modify the Lagrangian presented in Eq. (2.41) as follows: 
L=T(X)+A' k(x), (C.1) 


where we stacked together both equality and inequality constraints and their 
respective multipliers. Hence the Hessian, i.e. its second partial derivative 
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w.r.t. x, is obtained as 


a a o? 
zaf = Fal (x) 2 Aga 0o. (C.2) 


The second part of Eq. (C.2) involves a linear combination of the partial 
derivatives of each constraint row, using the Lagrangian multipliers as co- 
efficients. Nevertheless, as shown in Appendix B, we are able to define the 
partial derivative of kinematic and dynamic quantities w.r.t a single variable, 
in an easier way compared to the derivative of a specific row w.r.t all the 
variables. For example, we are able to define the partial derivative of the 
velocity of a link w.r.t a single joint, but not the partial derivative of the 
az-component of the velocity w.r.t all the joint variables. In other words, it is 


easier to define analytically 
o o 
— h(x 
Oxi (2 l ) 


rather than Zhi). With this rationale, we can notice the following 


Se= Erot [ATAO ATRO = ATO], (C3a) 
=VxP(x) HAT (VR) e (Vah) ®], (C.3b) 


where the apex (j) indicates the jth column. This highlights that ae h(x) = 
(Vx h(x))”. Hence, we have 
82 2 
L= 2 
OxiOxXj  OXiðXj 
thus building the Hessian by elements. 


This simple rearrangement allows us to reuse the results of Appendix B, 
and focus only on the computation of a generic column of each constraint 


ð i 
T(x) + A'S (VxhA(x)), (C.4) 


Jacobian, i.e. (Vx h(x)). In case of sum of two matrices, the derivative of 
a column is simply the sum of the derivatives. In the following, we analyze 
the derivative of a column resulting from the product of two matrices. 


C.1.2 Partial derivative of a matrix product’s column 


Consider two generic matrices A and B of suitable dimensions, both depend- 
ing on x. Our goal is to define 
ð ; 
— (AB). CS 
py (AB) (C5) 
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The j-th column of the product is equal to 


(AB)” = A(B)®, (C.6) 
i.e. it is equal to A times the j-th column of B. Hence 
o ; o . o . 
| (AB)\® = A> (Bo ~“ (A) B®). C.7 
pe A ABO) A (C.7) 


In order to compute the second term, we can write the product between A 
times and the j-th column of B as 


A(B) =X bijA®, (C.8) 


where b; j is the element (i,j) of B. In other words, the product between 
matrix A and vector B®?) can be seen as the linear combination of all the 
columns of A, using the components of BY) as coefficients. We can exploit 
this property to finally write 


Z (4B) -a2 (B) + boa (4%). (C.9) 


It is worth mentioning the following property: 
o ; o 
yera , 1 
J (A ) ej Dx, (A) e;. (C.10) 


This is trivially obtained by noticing that the partial derivative of the i-th 
column of A w.r.t x corresponds to 


Z (4) = fag (49) 4 (4%) #(a%)], (C.11) 
while 
wa |z (aM) 2 (48) ... a (aww), (C.12) 
J 


with na being the number of columns of A. Hence, the j-th column of 
x (A) corresponds to the i-th column of the partial derivative of A with 
respect to 7. We can exploit this property when selecting a column of Eq. 


(C.8)’s derivative: 


= — (A) B®, (C.13) 


Hence, it is of no surprise that 


a , 
D tug (A®) = [ae A) BO (A) BO... 32 (A) BY]. 


This means that the second component of Eq. (C.7) also corresponds to the 
derivative of A by all the components of x, each one multiplied by the 7—th 
column of B. 


C.1.3 Derivative of a skew symmetric matrix’s column 


Given x € R, the skew symmetric matrix associated to it is the following: 


0 —73 T2 
x =| 23 0 =z]. (C.14) 
— T2 Tı 0 


The partial derivative of each column are easily computed: 


a 0 0 0 
ag (2) 20: 20,6214 
a 0 —1 0 
5 0 0 -1 
a (ena) I son ie) 4 (C.15) 
% 10 0 
0 10 
2 (an) =- |1 00 
* 0 00 


Interestingly, these corresponds to the three-dimensional Levi-Civita symbol, 
or permutation symbol. 

The results obtained in the previous sections represent building blocks 
for computing the Hessian of the full optimization problem. In particular, 
we can notice that most of the derivatives shown in Appendix B are sum or 
products of matrices. 


C.2 Partial derivatives with respect to joint vari- 
ables 


In the following, we show the second order derivative of the quantities which 
may not be trivial or mechanically obtained using the properties showed 
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above. As an example, the second derivative of a rotation matrix with respect 
to the associated quaternion can be computed applying the properties from 
the two subsections above, and it is not developed here. On the contrary, we 
show the second derivatives of an adjoint matrix and of centroidal quantites 
w.r.t joint variables. 


C.2.1 Second partial derivative of a adjoint matrix 


The adjoint matrix is the most common object in the derivatives of Appendix 
B. The derivative of a generic adjoint matrix with respect to joint 7 is defined 
in Eq. (B.13). First of all, by noticing that 


a a 
ooa) sda 
jOSk SkOSj 


(C.16) 


we can assume, without loss of generality, that k € mp(C). In other words, 
we assume joint k to be either equal to j or closer to the base link D. With 
this assumption, Xg does not depend on sz. Define y and 8 as the parent 
and child link of joint k, respectively. We finally obtain 


82 
O8;,08; 


DXpg = D Xo (°s,0%) °Xc (spox) C Xp. (C.17) 


C.2.2 Second partial derivative of the center of mass position 


We focus on the center of mass position measured in the base frame, thus 
removing the dependency from the base pose. Starting from Eq.(B.49), we 
can rewrite it as 


o Mi l; 0 ; 
=P poon = |1; 03x3| `> = ex : 7 ‘Xol spo+ 


Os; eG 03x3 03x3 
B ‘pc M 

+P Xo (“srcx) a 

3x1 
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where we substituted the rotation matrix P R; with the corresponding adjoint. 
We move the selector and the transform ? Xç outside of the summation: 


ð Mi 1 0 
—P poom = [is 03x3” Xc > 2 |X| S] Xo spco+ 
Os; ore 03x3 03x3 
H TB 


i 
+ (spox) Cx, | ia 
03x1 


As Sec. C.2.1, we compute the partial derivative w.r.t another joint k, such 
that k € mp(C). Also in this case, there is no loss of generality. Under this 
assumption, “X; does not depend on sz. In fact, the summation is over 
all the links which have j in their path to the base, and, because of the 
assumption, the subtree between C (the child link of j) and ¿i cannot contain 
joint k. Henceforth, only matrix P” Xç depends on joint k and the derivative 
corresponds to: 


m; 
[1s 03x3] 2X (°s,.0%) î Xo `. ca OX 


B = 

PCoM = 

sjðsk D m 
ijEnrp(i) 


(C.18) 


l3 O3x3/ iy © C Cy | PcoM 
` Xco“spco +|“ spcx) “Xi 
03x3 03x3 03x1 
If k ¢ mp(C) and j ¢ Tg(0), i.e. joint j and joint k belong to two different 
subtrees (depending on the choice of base), then the derivative is null. 


C.2.3 Second partial derivative of the centroidal momentum 


Starting from Eq. (B.57), we write the second order partial derivative of the 
centroidal momentum as follows: 


o? o? 


o o 
ðsjðsk ah O08; 08, 


(ex”) Bh + Is; (ax”) Ta (ph) + 


(C.19) 


+ 2 (cx) ae (ph) + aX? 
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sk ðsj ðsjðsk (Bh). 


While ay (ex?) ae (gh) and ae (ax®) A (Bh) can be easily computed 
from the results of Sections B.8.1 and B.8.2, in the following we focus on the 


first and fourth terms. 
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We can rewrite GX? as 


L 
gX? = | et | EÈ | (C.20) 
PCoM 3 


Hence, by exploiting Eq.s (C.18) and (C.15), we can compute its second 
order derivative w.r.t joint values. 

For what concerns the second order derivative of the centroidal momentum 
expressed in the base frame, we can use the same assumption we do in 
Sec. C.2.2 about the relative ordering between joint j and joint k. Asa 
consequence, the matrices cX’ and Xc do not depend on sz. Hence 


2 
mI = 7 (»x°) aga 2 oX'ha. (Vra) + 
: injemp (i) 


= y coX'Ii Xo Capo x =~ (Vie) . (C.21) 
injemp (i) Osk 


The three derivatives included in Eq. (C.21) are readily computed as in Sec.s 
B.3 and B.6, obtaining the following result: 
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ee eee a x") xe C x XIX 
ðsjðsk” B Sp, i SP,C D C i oj + 


“jem p(t) 


= >A CX Xo spox C Xo’ sg0 x °? Vz o.(C.22) 
ijEnp(i) 


As for the center of mass position derivative, if k and j are on two 
different subtrees the derivative is null. 


C.3 The levi software library 


Appendices B and C present derivatives of several quantities which can be 
used to generate complex expressions. Nevertheless, combining these equa- 
tions “by hand” may be complex and error prone. For example, consider the 
case where several matrices are multiplied together. Their derivative would 
involve many combinations of the rule showed in Eq. (C.9), complicating 
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the code implementation. For this reason we implemented a library able to 
automatize this process. 

The levi software library is based on the concept of evaluable. It is an 
object comparable to a black-box. Nevertheless, we can retrieve its value and 
the derivatives. Common mathematical operators are evaluables as well, 
holding the expressions involved in the operation. expressions are objects 
containing a shared pointer to an evaluable. This allows us to easily handle 
the lifetime of an evaluable: if no expression points to it, the evaluable 
is deallocated because it means that this element is not involved into any 
mathematical expression. 

For example, consider the following relation: 


X:=Ax(B+C). (C.23) 


A, B and C are expressions pointing to evaluables defining their value. 
They can be constants or complicated custom functions. Then, (B + C) is 
also an expression pointing to an evaluable which contain both B and C. 
Finally, X is again an expression pointing to the evaluable performing 
the product between the two expressions A and (B+ C). 

The derivative of an evaluable can be obtained by specifying the column 
to be derived and the variable with respect to which the derivative has to 
be retrieved. The output of this call is another expression containing a 
pointer to the evaluable corresponding to the derivative. This allows us to 
combine derivatives seamlessly with other expressions. For example, when 
requesting the derivative of the sum of two expressions, we obtain yet another 
sum between the respective column derivative of the two addends. Hence, we 
can easily obtain the symbolic expression starting from complicated equations. 
At the same time, given the black-box paradigm, entire formulas can be 
substituted with a single custom evaluable fastening the evaluation of the 
expression. This mechanism allows us to use the analytical formula of a 
derivative when available, or to compute it by symbolic manipulation. 

In brief, levi allows evaluating the derivative of complex expressions 
by mixing symbolic computation with custom made evaluables. 
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